


Educational Diagnosis of Individual Pupils

A Study of the Individual Achievements of Seventy-Two
Junior High School Boys in a Group of
Eleven Standardized Tests



By 
CHESTER A. BUCKNER 



Submitted in partial fulfillment of the requirements for the Degree
of Doctor of Philosophy in the Faculty of Philosophy,
Columbia University



Published by

Teachers College, Columbia University

NEW YORK CITY

1919



Copyright, 1919, by Chester A. Buckner 



ACKNOWLEDGMENTS 

For the use of data secured and for the privilege of securing
additional data concerning the achievements of certain pupils 
in the Speyer School of Teachers College, I am indebted to 
Professor Thomas H. Briggs. I am grateful to the teachers of 
this school and to Dr. E. K. Fretwell for their cooperative aid in 
administering the tests. 

For the suggestion of the field of research and for helpful 
supervision of the work my obligation to Professor Briggs is 
also gratefully acknowledged. To Professor George D. Strayer 
I am likewise indebted for constructive criticism. 

Because of her devoted interest and untiring assistance in the 
prosecution of this study I am most grateful to my wife, Neva 
Starrett Buckner. 

C. A. B. 






CONTENTS

SECTION

I. The Problem

II. Preliminary Investigation

III. Experimental Material and Method
     1. The Subjects
     2. The Administration and Scoring of the Tests
     3. Special Testing

IV. Statistical Treatment
     1. Transmutation and Distribution of Scores
     2. The Use of Averages and Variabilities
     3. Redistribution of Scores

V. Individual Variability Compared With Group Variability
     1. The Amount of Individual Variability
     2. Distribution of Individual Variability
     3. Overlapping of Divisions of the Group

VI. Extreme Variability in Individual Cases
     1. Extreme Variability in Different Tests
     2. Extreme Variability of Different Boys
     3. Reduction of Variability by Re-examination
     4. The Causes of Extreme Variability

VII. Correlation Between Measures of Ability and Variability
     1. Correlation Between Measures of Ability
     2. Correlation Between Measures of Variability
     3. Correlation Between Measures of Ability and Variability

VIII. Conclusions

Appendix



INDEX OF TABLES

NUMBER

I. The 34 Most Erratic Scores Distributed by Quartiles
     according to the Different Classifications

II. The 34 Most Erratic Scores of Each Classification Compared
     according to the Number Above and the Number
     Below the Median Score of the Individual

III. Distribution of the 34 Most Erratic Scores among the
     Tests according to the Different Classifications

IV. Summary of Variation in Ranks of the 97 Pupils in the
     Eleven Tests

V. Distribution of Ranges in Ranks of the 97 Pupils in the
     Eleven Tests

VI. Distribution of Scores of the 97 Pupils in the Eleven Tests
     according to S.D. Distance From the Median Score
     of the Individual

VII. Distribution of the 72 Boys among Groups in School

VIII. The Tests and the Times at Which They Were Given

IX. The Semi-Interquartile-Range (Q) of the Distribution
     of the Original Scores for Each Test

X. Distribution of Scores Transmuted into Multiples of Q
     Above and Below the Median of the Original Distribution
     in Each Test

XI. Average of the Individual Semi-Interquartile-Ranges in
     the Eleven Tests

XII. Distribution of the Individual Semi-Interquartile-Ranges
     (Approximation) in the Eleven Tests

XIII. Averages in Connection with Individual Ranges in Scores
     Transmuted into Multiples of Q

XIV. Comparison of the Variability of Individual Ranges in
     the Eleven Tests by Quartiles and by Tertiles

XV. Distribution (in Per Cents) of Scores Above and Below
     the Individual Medians in the Eleven Tests Transmuted
     into Multiples of Q by the Original Distributions

XVI. Distribution (in Per Cents) of Scores Above and Below
     the Individual Medians in Certain Tests Transmuted
     into Multiples of Q by the Original Distributions

XVII. The Q of the Distribution of Scores Above and Below the
     Individual Medians for the Three Testings and for
     Certain Tests Combined

XVIII. Distribution by Tertiles of Ranges Above and Below the
     Individual Medians in the Eleven Tests in Values of Q

XIX. Distribution by Tertiles of Ranges Above and Below the
     Individual Medians in Certain Tests in Values of Q

XX. Per Cent of Scores in Each Quartile Above the 75 Percentile
     and the Median of Each Quartile Higher, and
     the Per Cent in Each Quartile Below the Median and
     the 25 Percentile of Each Quartile Lower

XXI. Difference in Achievement Between Quartiles Measured
     in Terms of the Q Variability of the Group

XXII. Number of Scores in the Different Tests 3 Q or More
     Plus or Minus

XXIII. Number of Scores 3 Q or More Plus or Minus by Tertiles
     and Total for Each Testing

XXIV. Number of Boys Having Scores 3 Q or More Plus or Minus
     in Each Type of Test in Either One or More Testings

XXV. Number of Boys Making Different Numbers of Scores 3 Q
     or More Plus or Minus in All Three of the Testings

XXVI. Number of Boys Making Scores 3 Q or More Plus or Minus
     and the Number of Scores of Either Type that Each
     Boy Made

XXVII. Number of Boys Making Scores 3 Q or More Plus, Minus,
     and Plus and Minus in One or More of the Three
     Testings

XXVIII. Number of Boys Having Scores 3 Q or More Either Plus,
     or Minus, or Plus and Minus by Tertiles and Total
     for Each Testing

XXIX. Comparison of Scores in Original and Special Tests of
     Certain Boys Having Scores 3 Q or More Minus in
     Original Tests

XXX. Comparison of Scores of Certain Boys in Special Tests
     with Their Scores in Corresponding Original Tests

XXXI. The Amount of Q Representing One-Half the Interval of
     the Distributions of the Different Tests

XXXII. The Variability of Certain Individuals in the Ranking
     of Their Own Achievements for the Different Testings

XXXIII. Teachers' Ratings on Certain Points Concerning the Work
     of Fifteen Pupils

XXXIV. Correlation Between Composite Rankings in Ability

XXXV. Correlation Between Measures of Variability in the Eleven
     Tests at the Different Times They Were Given

XXXVI. Correlation Between Measures of Ability and Variability
     in the Eleven Tests

XXXVII. Distribution of Scores Above and Below the Individual
     Medians in the Eleven Tests Transmuted into Multiples
     of Q by the Original Distributions

XXXVIII. Distribution of Scores Above and Below the Individual
     Medians in Certain Tests Transmuted into Multiples
     of Q by the Original Distributions

XXXIX. Original Scores by Tests and by Individuals. February, 1916

XL. Original Scores by Tests and by Individuals. February, 1917

XLI. Original Scores by Tests and by Individuals. June, 1917

XLII. Original Scores by Tests and by Individuals. Additional Tests



INDEX OF FIGURES

NUMBER

1. Different Forms of Distribution of the Scores of Individuals in
     the Eleven Tests

2. Distribution of the Scores, in Values of Q, of Three Individuals.
     February, 1916 Tests

3a, b, c to 13a, b, c. Distribution of Scores in Each Test Transmuted
     into Multiples of Q Above and Below the Median of the Original
     Distribution

14. Showing the Effect of Two Different Forms of Distribution upon
     the Q

15. Chart Showing the Value in Q of Each Score of the June, 1917
     Tests. (Insert opposite page 28)

16a to 17c. Distribution (in Per Cents) of Scores Above and Below
     the Individual Medians in the Eleven Tests Transmuted into
     Multiples of Q by the Original Distributions

18 to 21. Distribution (in Per Cents) of Scores Above and Below
     the Individual Medians in Certain Tests Transmuted into
     Multiples of Q by the Original Distributions

22a to 23c. Distribution by Tertiles of Ranges Above and Below the
     Individual Medians in the Eleven Tests and in the Eight
     Trabue Tests in Values of Q

24a to 25c. Distribution by Tertiles of Ranges Above and Below the
     Individual Medians in the Six Mathematics and Five Directions
     Tests



EDUCATIONAL DIAGNOSIS OF INDIVIDUAL PUPILS 



I

THE PROBLEM

Educational diagnosis presents many problems, each with its 
specific implications. In an approach to the present study it 
does not seem imperative to consider either a logical organization
of these problems or an exhaustive summary of relevant investigations;
however, mention of a few problems and methods
will assist in the orientation of this study in the general field. 
One phase of educational diagnosis is based upon standardized 
tests and scales, which have been used to study and compare the 
attainments of groups of pupils, usually school grades or school 
systems. The average or median achievement of different groups, 
the extent of overlapping, the amount of variability, and the dis- 
tribution of results have been used as measures for comparison. 
The relation of the attainment of a group in one function to its 
attainment in another function or trait has been studied exten- 
sively and expressed by various formulae for correlation. Results 
obtained by standardized measurements have been compared 
with teachers' judgments, directly by teachers' rankings and in- 
directly by comparison with school marks. Tests of the same 
function or trait have been compared and ranked as to merit. 
Tests of different traits have been compared and ranked as to 
their merit in the evaluation of general intelligence. Mistakes 
made most frequently by the group have been studied. These 
examples of the use of standardized tests suggest that the trend 
of the movement in scientific measurements has been to em- 
phasize the group. 

In these studies and in even more extensive ones now being 
undertaken, the individual, even though not lost sight of, has
not received as much attention as the group. It would seem 
that greater emphasis should be placed upon the measurement 
of the individual and the interpretation of results along with the 
measurement of the group and the further development of the 
instrumentalities of measurement. It is the purpose of this 
study to ascertain to what extent and with what degree of reliability
standardized tests and scales can be used to discriminate
educational attainments of the individual. Is it possible to diag- 
nose a case and to prescribe specific mental work on the basis of 
achievements in such tests? The following are some of the questions
to be considered.

1. How can individual measures of achievement in different 
tests be compared or equated without losing the refinement of the 
original scores? 

2. Do scores of equal value in a given test necessarily have 
the same meaning for two or more individuals? 

3. What is the amount of the individual's variability among 
the different tests? 

4. How are the scores of the individual distributed with re- 
spect to some measure of his central tendency? 

5. How do the bright, the mediocre, and the dull pupils compare
with each other in their variability and distribution of
achievements?

6. To what extent are there extremely variable or erratic
scores?

7. How do the bright, the mediocre, and the dull pupils compare
in the number of erratic scores which they make?

8. What are the causes of the extremely variant scores? 

9. What is the relation between different measures of ability, 
between different measures of variability, and between meas- 
ures of ability and variability? 

The specific purpose of this investigation is to determine the 
individual achievements of seventy-two junior high school pupils 
in a group of eleven tests given at three different times during a 
period of a year and a half. The tests have been used to rank 
these pupils in achievement, to determine the amount of varia- 
bility of the group in a single test, and to determine the amount 
of variability of the individual in the eleven tests. That the 
data obtained from these tests are valuable in the general di- 
rection of the work of these pupils has been demonstrated at the 
Speyer School of Teachers College. That such data can be used 
to advantage in the prescription of special work in certain cases 
is a logical assumption. This, however, should be tested by practice
and by further experimentation.



II 

PRELIMINARY INVESTIGATION 

The purpose of this section of the study is to answer the first 
question proposed in the statement of the problem, namely: 

How can individual measures of achievement in different tests
be compared or equated without losing the refinement of the 
original scores? This section is introduced not only to describe 
a way of equating measures but also to compare two methods of 
equating measures of achievement in different tests and to de- 
termine by which method more reliable results can be obtained. 
Special emphasis is placed upon the classification of extremely 
variable or erratic scores. 

The data for the preliminary investigation consist of the 
scores of ninety-seven seventh grade boys in eleven standardized 
tests and scales given in February, 1916. The tests are: Woody
Arithmetic Scales, Series A, Multiplication and Division; Trabue
Completion-Test Language Scales, Scale B and Scale C; Thorndike
Reading Scale Alpha 2, Part II; Thorndike Reading Scale
A, Visual Vocabulary; Composition, scored by the Hillegas Scale
for the Measurement of Quality in English Composition; Ayres
Measuring Scale for Ability in Spelling; Woodworth and Wells
Association Tests, Opposites, Mixed Relations, and Easy Directions.
The description of the subjects, the tests, and the scoring
of the tests in Section III, Experimental Material and Method, 
is applicable here and is omitted from this section because the 
chief concern here is the evaluation of methods of statistical 
treatment. 

The first method used to compare individual measures of
achievement will be called Classification by Rank. By this
method the scores of each test were arranged in frequency tables
according to the original scores of the papers. The scores were
then turned into ranks. The highest score was ranked one and
the lowest score was ranked ninety-seven. In cases of tied scores
the mid-rank of the interval was given to each score. The eleven
ranks of each individual were assembled and arranged in order
from highest to lowest. The rank by the original distribution
was retained. The variability of a score was then measured
by its distance, in terms of ranks by the original distribution,
from the median rank of the individual. Obviously the ranks 
of an individual could to varying degrees approximate three 
forms of distribution, — a distribution skewed downward from the 
median, a distribution skewed upward from the median, and a 
distribution approximating the normal surface of frequency. 
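The Classification by Rank described above can be illustrated with a minimal Python sketch. The function names and the sample scores are hypothetical and are not taken from the study. The sketch assumes only what the text states: the highest score receives rank one, tied scores share the mid-rank of the interval they occupy, and the variability of a score is its distance in ranks from the median rank of the individual.

```python
from statistics import median


def mid_ranks(scores):
    """Rank one test's scores so the highest is rank 1; ties share the mid-rank."""
    order = sorted(scores, reverse=True)
    rank_of = {}
    for value in set(order):
        first = order.index(value) + 1            # first rank occupied by this value
        count = order.count(value)                # how many pupils tied at this value
        rank_of[value] = first + (count - 1) / 2  # middle of the tied interval
    return [rank_of[s] for s in scores]


def deviations_from_median_rank(ranks):
    """Distance of each of one pupil's ranks from that pupil's median rank."""
    m = median(ranks)
    return [r - m for r in ranks]


# Hypothetical scores of four pupils in a single test (two pupils tied at 8).
test_scores = [10, 8, 8, 5]
print(mid_ranks(test_scores))

# Hypothetical ranks of one pupil in three tests.
print(deviations_from_median_rank([3, 10, 50]))
```

Because rank one is the best rank, a negative deviation in this sketch marks a score better than the pupil's median rank.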
[Fig. 1. Different Forms of Distribution of the Scores of Individuals in
the Eleven Tests. Panels: Case 60, Case 65, Case 92.]

The cases in Fig. 1 illustrate these forms. These are actual 
cases selected from the group of ninety-seven boys. The scale 
at the left of the plate represents the range in ranks which could 
be obtained in each test. The letters refer to the tests in which 
the ranks indicated were made. The case numbers 60, 65, and 
92 are the serial numbers which these boys chanced to have when 
the names of the group were arranged in alphabetical order. 
These are extreme cases, but only in the sense that they are near 
the limit of the range of the respective forms which they are 
selected to illustrate, and not in the sense that they are markedly 
different from other cases.

Results obtained from this method are shown in Tables IV and V.
These results will be discussed in connection with the results
from the other method.

The second method will be called Classification by Standard
Deviation (S.D.).¹ It is like the first method only to the point
of the frequency tables of the original scores. Using these tables
the original scores were transmuted into multiples of S.D. The
scores of each pupil in multiples of S.D. of the original distributions
were then collected and arranged in order of S.D. value
from highest to lowest. The variability of a score was then
measured by its distance, in terms of S.D. by the original distribution,
from the median score of the individual. The distribution
of all the scores in multiples of S.D. is given in Table VI.
The results show that there are thirty-four scores which
deviate from the medians of the respective individuals by more
than 2 S.D.

TABLE I

The 34 Most Erratic Scores Distributed by Quartiles According
to the Different Classifications

                                  Quartile
                            I    II   III    IV
Classification by Rank      9     6     9    10
Classification by S.D.      4     8     6    16
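The Classification by S.D. can be sketched in the same hedged way. This chunk of the text does not say from which point the S.D. multiples were measured, so the sketch assumes the group mean as origin and the root-mean-square deviation of the whole group (`statistics.pstdev`) as the unit. The function names and the sample figures are hypothetical.

```python
from statistics import mean, median, pstdev


def to_sd_units(group_scores):
    """Express one test's scores as multiples of the group S.D.

    Assumption: deviations are taken from the group mean.
    """
    m = mean(group_scores)
    sd = pstdev(group_scores)
    return [(s - m) / sd for s in group_scores]


def erratic_scores(pupil_sd_values, threshold=2.0):
    """Return (test index, deviation) for scores more than `threshold` S.D.
    away from the pupil's own median score."""
    m = median(pupil_sd_values)
    return [(i, v - m) for i, v in enumerate(pupil_sd_values)
            if abs(v - m) > threshold]


# Hypothetical group scores in one test.
print(to_sd_units([2, 4, 4, 4, 5, 5, 7, 9]))

# Hypothetical S.D. values of one pupil in five tests; only the fourth
# lies more than 2 S.D. from his median.
print(erratic_scores([0.0, 0.5, -0.5, 3.0, 0.2]))
```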

These two methods of equating scores and determining indi- 
vidual variability will now be compared in order to arrive at 
some basis for choosing the one which will produce the more 
reliable results. In Table I the thirty-four most erratic scores 
in each classification are distributed among the quartiles of the 
group, the quartiles being determined from the median ranks 
of the individuals. Quartile I is the highest. The table reveals 
a rather marked difference between the two classifications. By 
the Classification by Rank the erratic scores are distributed 
quite evenly among the four quartiles. The Classification by 
S.D. produces decidedly the greatest number of erratic scores in 
Quartile IV, while it produces relatively few in Quartile I. 

Another way of comparing these methods is by dividing the
34 most erratic scores of each classification into the number
above the median and the number below the median score of
the individual. This is done in Table II. By this comparison
the Classification by S.D. differs very decidedly from the Classification
by Rank. By S.D. the number of erratic scores below
the median greatly exceeds the number above the median; by
Rank the number of scores above and the number of scores
below the median are about equal.

1 Mean Square Deviation.

TABLE II

The 34 Most Erratic Scores of Each Classification Compared
According to the Number Above and the Number Below
the Median Score of the Individual

                          Above the    Below the
                            Median       Median
Classification by Rank        18           16
Classification by S.D.         4           30

That these two methods do not affect the same scores in a dif- 
ferent way, as might be inferred from Table II, is shown by the 
fact that of the 34 most erratic scores in each classification only 
eight are common to both classifications. 
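The two comparisons just made, the count of erratic scores above and below the individual medians and the number of erratic scores common to both classifications, can be sketched as follows. The data are hypothetical and much smaller than the 34 scores of the study, and the sign convention (positive meaning better than the pupil's own median) is an assumption introduced for the illustration.

```python
def most_erratic(deviations, k):
    """Return the keys of the k deviations that are largest in absolute size."""
    ranked = sorted(deviations, key=lambda key: abs(deviations[key]), reverse=True)
    return set(ranked[:k])


def above_below(deviations, keys):
    """Count the selected scores lying above and below the individual median."""
    above = sum(1 for key in keys if deviations[key] > 0)
    below = sum(1 for key in keys if deviations[key] < 0)
    return above, below


# Hypothetical deviations from each pupil's own median, keyed by (pupil, test).
by_rank = {("a", "spelling"): 40, ("a", "reading"): -5,
           ("b", "spelling"): -30, ("b", "reading"): 10}
by_sd = {("a", "spelling"): 0.5, ("a", "reading"): -2.5,
         ("b", "spelling"): -3.0, ("b", "reading"): 0.1}

top_rank = most_erratic(by_rank, 2)
top_sd = most_erratic(by_sd, 2)
common = top_rank & top_sd          # scores erratic under both classifications
print(len(common), above_below(by_sd, top_sd))
```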

The last method of comparing the classifications directly was 
by distributing the 34 most variable scores among the eleven 
tests. The results are brought together in Table III. Here 
also there are some rather marked differences between the two 
classifications. The greatest contrast in the number of erratic
scores is found in the case of spelling. 

TABLE III

Distribution of the 34 Most Erratic Scores Among the Tests
According to the Different Classifications

Classification   [one column for each of the eleven tests; headings illegible]
Rank               5    7    5    2    1    1    5    2    2    3    1
S.D.               7    6    3    -    1    -    2    8    3    2    2

Indirectly, further comparison of the two classifications can
be made by a study of Tables IV, V, and VI. Table IV shows
that the average range of pupils who stand highest and of those
who stand lowest is considerably less than that of those of average
ability. According to range in ranks above and below the
median score of the individual the four quartiles have close inverse
relation. The average range above the median of the first
quartile is approximately the same as the average range below
the median of the fourth quartile. The inverse relation holds
right through to the average range above the median of the
fourth quartile which is approximately the same as the average
range below the median of the first quartile. The range between
the last two scores at each end of the distribution of the individual's
scores has an inverse relation similar to that of the average
range above and below the median. In average S.D. the quartiles
have about the same relation as in average range.

TABLE IV

Summary of Variation in Ranks of the 97 Pupils in the
Eleven Tests

                                                Av. Interval
                           Average Range        between Last
                                                 Two Scores
                  Av.      Above    Below      Above    Below     Av.
                 Range     Med.     Med.       Med.     Med.      S.D.
Quartile I       69.8      23.3     46.6       17.8      3.2      22.2
Quartile II      81.6      36.2     45.4       13.1      8.2      26.9
Quartile III     78.5      44.7     33.8        6.6     10.9      26.4
Quartile IV      67.6      46.9     21.8        2.3     15.6      22.2



Table V is a distribution of the ranges in ranks. From this 
the median range in ranks, according to the original distribution, 
is found to be 76.6 which is 82 per cent of the maximum possible 
range. Taken at its absolute value this seems to be a high per 
cent. Whether or not its relative value is high must await fur- 
ther investigation. This question will be considered further in 
connection with Table XIII in Section V. 

TABLE V

Distribution of Ranges in Ranks of the 97 Pupils in the
Eleven Tests

Value in Ranks     Frequency
90 to 94                8
85 to 89               17
80 to 84               14
75 to 79               15
70 to 74               13
65 to 69               12
60 to 64                8
55 to 59                5
50 to 54                2
45 to 49
40 to 44
35 to 39
30 to 34



III

EXPERIMENTAL MATERIAL AND METHOD

1. The Subjects 

The subjects for this investigation were a group of boys in 
the Speyer School of Teachers College, Columbia University. 
There were seventy-two individuals for whom complete records 
were secured in all eleven tests in all three testings, February 
1916, February 1917, and June 1917. This school was opened 
as an experimental academic junior high school in February 
1916. The group which entered first, about two hundred in 
all, came from twenty-four classes in five of the public schools 
in New York City, Nos. 5, 10 B, 43, 184, and 186. The seventy- 
two subjects for this study were among this group. 

Before the experimental school was opened the twenty-four 
classes in the public schools were given the following tests: 
Woody Arithmetic, Multiplication Scale, Series A; Trabue Completion-Test
Language Scales B and C; Composition, scored by
the Hillegas Scale; and Ayres Spelling Scale. Soon after they
entered Speyer School six additional tests were given: Woody
Arithmetic, Division Scale, Series A; Thorndike Reading Scale
Alpha 2, Part II; Thorndike Reading Scale A, Visual Vocabulary;
Woodworth and Wells Association Tests, Opposites, Mixed
Relations, and Easy Directions. Complete records of the scores
in all of the eleven tests were secured for ninety-seven of the 
boys entering. These tests will be referred to throughout this 
study as the February 1916 tests. 

For purposes of this investigation it is important to know 
whether the boys are a highly selected group or whether they 
represent the different grades of ability in typical classes begin- 
ning the seventh school grade. Studying these same boys in con- 
nection with a different problem Dr. E. K. Fretwell answers 
this question as follows: "It is noted then that the Speyer
group is, on the basis of achievements in these five tests, somewhat
better than the other group, though only slightly better.
It should also be pointed out that this group coming to Speyer
did not cluster around the median of achievement and that there
were all kinds of pupils from the brightest to very nearly the
dullest. On this point the estimates of the twenty-four teachers
are in accord with the tests."¹ The five tests referred to are
those named above which were given before the school was 
opened. The estimates of the teachers were for intelligence 
and industry. A more detailed discussion of this question and 
also a fuller description of the entrance of these boys into the 
Speyer School may be had by consulting the study referred to 
above. 

It is also of importance in connection with this study to point 
out that the boys were divided into groups for the purpose of in- 
struction on the basis of their achievement in terms of their 
average rank in the eleven tests. When the average rank for 
each boy was determined groups of about twenty-five each were 
formed on the basis of achievement in the tests. At any time 
after this the teachers by their combined judgments could make 
any transfers they considered desirable so long as the groups 
were kept approximately the same in size. 
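The grouping procedure can be sketched as below. The text states only that groups of about twenty-five were formed from the average rank in the eleven tests, so the exact rule used here (sort by average rank, then cut into consecutive blocks of a fixed size) is an assumption, and the function name and sample data are hypothetical.

```python
from statistics import mean


def assign_groups(ranks_by_boy, group_size=25):
    """Assign each boy to an instruction group by his average rank.

    Assumption: boys are ordered by average rank (lowest number is best)
    and cut into consecutive blocks of `group_size`.
    """
    averages = {boy: mean(ranks) for boy, ranks in ranks_by_boy.items()}
    ordered = sorted(averages, key=averages.get)
    return {boy: position // group_size + 1
            for position, boy in enumerate(ordered)}


# Hypothetical ranks of four boys in two tests, with groups of two.
sample = {"a": [1, 2], "b": [5, 6], "c": [3, 3], "d": [9, 9]}
print(assign_groups(sample, group_size=2))
```

The later transfers made on the teachers' combined judgment are not modeled here.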

Of the ninety-seven boys who were given the February 1916 
tests seventy-five were in Speyer School in February, 1917, and 
seventy-two in June, 1917, when the collection of data for this 
investigation was finished. The distribution of these seventy- 
two boys among the different groups in June, 1916 and June, 
1917 is shown in Table VII. 

TABLE VII

Distribution of the 72 Boys Among Groups in School

Groups          1

June, 1916     13

June, 1917     13

This table shows that the seventy-two boys were quite uni- 
formly distributed among the two hundred and therefore were 
not materially different from typical seventh grade boys.

After the February 1917 tests were given the boys were numbered
from 1 to 75 according to the alphabetical arrangement
of their names. These serial numbers are retained throughout
the investigation. Tables XXXIX to XLI in the Appendix contain
the scores for the three testings. In February 1917 complete
records for seventy-five individuals were secured. Three
boys, Nos. 4, 22, and 37, were not present when the June 1917
tests were given. Because some of the statistical work had been
done before the last testing was made the serial numbers were
not changed to 1 to 72 but instead the three numbers noted
above were dropped.

1 Fretwell, E. K., A Study in Educational Prognosis, Teachers College,
Columbia University, Contributions to Education, No. 99.

2. The Administration and Scoring of the Tests 

The standardized educational and psychological tests listed in 
Table VIII were used to secure the data for studying the edu- 
cational attainments of the pupils who have been described 
above. The tests and the method of administering them will 
now be described briefly. References to full discussion of the 
tests by their authors are given for those not already familiar 
with these tests who may wish to make further study of them. 

Woody Arithmetic Scales²

The Multiplication Scale, Series A, consists of thirty-nine 
problems scaled in degree of difficulty. The first problem is 
so easy that out of 943 seventh grade pupils tested by the author 
of the scale 936 solved it correctly, and the last one so difficult 
that of the same group only 186 solved it correctly. Multipli- 
cation Scale, Series B, is composed of twenty problems selected 
from Series A. It covers practically the same range of difficulty 
as does Series A. 

The Division Scale, Series A, is made up of thirty-six prob- 
lems, the first of which was solved by 822 out of 940 seventh 
grade pupils tested by the author of the scale, and the thirty- 
sixth by 123. Fifteen problems of Series A covering the entire 
range of difficulty compose Series B. 

The time given was sufficient for practically all of the pupils 
to complete the tests. In accordance with the recommendation 
of the author "the standard for marking a problem correct was
absolute accuracy, and, wherever possible, reduction to its lowest 
terms." One point was given for each correct answer. The 
score for the individual is the number of correct answers. 

2 Woody, Clifford, Measurements of Some Achievements in Arithmetic,
Teachers College, Columbia University, Contributions to Education, No. 80.

Hotz Algebra Scales³

The Hotz First Year Algebra Scales were in the process of 
construction when the data for this study were secured. The 
Addition and Subtraction Scale is made up of twenty-four problems
scaled in degree of difficulty from easy to difficult. The
Multiplication and Division Scale is composed of
twenty-three problems. It is built on the same principle. 

Trabue Completion-Test Language Scales ⁴

These tests are composed of mutilated sentences which the
subject is to complete by filling in the words which make the
"most sensible statement." Scales B, C, D, and E consist of
ten sentences each, scaled so that they range in P.E. units of 
value from about 1 to between 10.5 and 11. The intervals be- 
tween sentences are nearly equal. Scales J and K have seven 
sentences each, ranging in value from a little more than 4 to 
about 12.5; and L and M have eight sentences each, ranging 
from almost 7 to a little above 11 P.E. units of difficulty. 

Seven minutes were given for completion of the sentences of 
each scale. In this amount of time all the subjects apparently 
had opportunity to show their maximum ability in such work for 
in most cases more sentences were attempted than were correctly 
done. The method of scoring was that suggested by the author 
of the scales. In cases where the lists given in his guide for 
scoring did not cover the answer in question the standard de- 
cided upon was recorded and used in any similar instances. This 
made for uniformity in scoring. Two points were given for each 
sentence completed correctly and one point for "each sentence
completed with only a slight imperfection." 

Scale Alpha 2. For Measuring the Understanding of Sentences ⁵

Part II of this scale was used. Scale Alpha 2 is "an improved
and extended form" of "a provisional scale Alpha for
measuring ability in paragraph reading." Part II begins with

3 Hotz, Henry G., First Year Algebra Scales, Teachers College, Columbia
University, Contributions to Education, No. 90.

4 Trabue, M. R., Completion-Test Language Scales, Teachers College,
Columbia University, Contributions to Education, No. 77. 

5 Thorndike, E. L., "Measurement of Achievement in Reading," Teachers
College Record, Vol. XV, No. 4. "An Improved Scale for Measuring Ability
in Reading," Teachers College Record, Vol. XVI, No. 5 and Vol. XVII, No. 1.
difficulty 7 and extends through difficulties 8, 8½, and 9.
There are ten paragraphs in all, concerning the meaning of 
which twenty-four questions are asked. The subject's achieve- 
ment in the test is determined by his answers to the questions 
asked on each paragraph. 

The selections from Beta and from S are similar to the para- 
graphs in Alpha 2. Because Alpha 2 had been used twice it 
was considered best not to repeat it again. Therefore three 
paragraphs were selected from Scale Beta, and the one paragraph 
of S of the longer reading scale was added. Twenty-nine ques- 
tions are asked concerning the meaning of these paragraphs. 

In scoring the tests answers were divided into three classes:
correct, slightly incorrect, and wrong, for which 2, 1, and 0
points respectively were given. The total number of points is
the score given the individual. The time was sufficient for all 
but the very slowest to do as much as they could with the test. 
As an aid to uniformity in scoring record was made of types of 
answers concerning which there was question as to their class- 
ification. This was used to supplement the list given by the au- 
thor of the scale. 
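
The point scheme just described can be sketched in a few lines of Python. This is only an illustration: the label strings and the function name are assumptions, and in the study itself the answers were classified by hand against the author's lists.

```python
# Points per class of answer, as stated in the text: 2, 1, and 0.
POINTS = {"correct": 2, "slightly incorrect": 1, "wrong": 0}

def total_score(classified_answers):
    """Sum the points for a pupil's answers, each already classified by a scorer."""
    return sum(POINTS[label] for label in classified_answers)

# Hypothetical paper with four classified answers.
print(total_score(["correct", "wrong", "slightly incorrect", "correct"]))  # 5
```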

Visual Vocabulary ⁶

The Visual Vocabulary tests consist of lists of words which are 
to be classified accordingly as they mean a flower, an animal, a 
boy's name, a game, a book, something about time, something 
good to be or do, or something bad to be or do. The classi- 
fication of the word is indicated by writing a designated letter 
or word under it. 

The Thorndike Reading Scale A was given in February, 1916.
It consists of forty-three words arranged by groups of five in 
ascending degrees of difficulty. The last group has only three 
words. The test given in February, 1917 was made up of one 
hundred and seventy words in fourteen groups selected from 
the Thorndike Scale A 2 plus four groups selected from its pro-
visional extension. The groups begin with step 6}^ x and ex- 
tend through step 12j4. 

The Thorndike Reading Scale B, y series, was given in June,
1917. It consists of one hundred and twenty words arranged 

6 Thorndike, E. L., "Measurement of Achievement in Reading," Teachers
College Record, Vol. XV, No. 4, and Vol. XVII, No. 5.
in groups of ten. It is built on the same principle as the other
two tests, using however a different list of meanings to determine
the classification.

Composition ⁷

The subjects for the test in composition were: for February,
1916, How I Would Spend Twenty Dollars; for February, 1917,
What I Should Like to do Next Saturday; for June, 1917, How
I Should Like to Spend My Vacation, or A Narrow Escape.
These were all rated by the Hillegas Scale for the Measure- 
ment of Quality in English Composition. The first set of com- 
positions was rated by from four to eight experienced judges. 
The average of their marks was taken as the score for the com- 
position. The second set was rated by four experienced judges 
and the third set by three of the four who rated the second set. 
Here also the ratings were averaged to determine the score for 
the composition. 

The time allowed for writing the composition was thirty min- 
utes for the first two sets and fifty minutes for the third set. 

Spelling ⁸

The Ayres Measuring Scale for Ability in Spelling was used. 
The first time the tests were given fifty words were selected 
from the Q list. The Q list is rated as of a difficulty such that 
the average score of a seventh grade class should be 92 per cent. 
In the second testing fifty words selected from lists U, V, W, 
and X were given. The last time fifty words from lists T to Z 
inclusive were used. 

The words were pronounced by the regular teacher. Each 
word was pronounced twice and a third time if asked for. One 
point was given for each word spelled correctly. The teachers 
did not score the papers. 

Opposites Tests ⁹

The Opposites Tests consist of twenty words each. The pur- 
pose of the test is to determine the number of words having 

7 Hillegas, Milo B., "A Scale for the Measurement of Quality in English
Composition by Young People," Teachers College Record, Vol. XIII, No. 4.

8 Ayres, Leonard P., A Measuring Scale for Ability in Spelling, Division
of Education, Russell Sage Foundation, Bulletin E 139.
a meaning opposite the words of the list, which can be written
in a given length of time. The "north-south" and the "long-short"
lists are of equal difficulty. These were used for the
first two testings. The time allowed was seventy-two seconds.
The "high-low" list is made up of the easiest words of the other
two and consequently the time was reduced. Forty seconds
were allowed when it was given in June, 1917.

The responses were classed as either right or wrong. One 
point was given for each correct response. 

Mixed Relations Test ⁹

In the Mixed Relations Tests a pair of words is given to indi- 
cate the relation desired in each response to a third word. There 
are twenty such series in each test. Before the test began a
sample was exhibited and the explanation made that after the 
third word of each series a fourth word was to be written which 
would have the same relation to the third word that the second 
had to the first. In the first two testings one hundred and 
twelve seconds were allowed, but the third time this was reduced 
to ninety seconds. The responses were considered either right 
or wrong, one point being given for each one right. 

Easy Directions Test ⁹

In the Easy Directions Tests the subject is directed to make 
a definite response such as: Cross out the smallest dot • • • , 
or Cross out the g in tiger. The two tests are of approximately 
equal difficulty. The "smallest dot" test was given in February,
1916, and the "g in tiger" test both times the tests were
repeated. One point was given for each correct response. The 
time allowed was eighty-two seconds for the first two testings 
and eighty for the third. 

Hard Directions Test ⁹

The Hard Directions Test is similar to the Easy Directions 
except that "the object here is to complicate the directions
somewhat, by calling for conditional and alternative responses, 
etc." The first two or three directions are easy enough to in- 

9 Woodworth, R. S., and Wells, Frederic Lyman, "Association Tests,"
Psychological Monographs, Vol. XIII, No. 6.




sure a proper start on the test and the rest are more complicated.
Because of the "conditional and alternative" responses
the scoring is somewhat complicated. A standard of twenty-two
possibilities for mistakes was decided upon and used consistently
in scoring. From "twenty-two" one was deducted for
each wrong response. The time allowed was two minutes.

All of the tests were either scored or their scorings checked
by one or the other of the two persons chiefly interested in the
prosecution of this study, except the Composition tests, the
scores of which as has already been stated are averages, the
Algebra tests, and twenty-eight of the seven hundred and ninety-two
papers of the February 1916 testing. In scoring the papers
and copying the scores extreme care was taken to avoid chance 
mistakes. This increased the amount of time consumed, but 
greater accuracy in scoring is needed for individual results than 
for group results. 

It is very essential to the purposes of this investigation that 
the scoring be uniform and that as fine discriminations as pos- 
sible be made because the achievement of the individual in spe- 
cific tests is the problem for study. An error in scoring which 
affects the group standing only slightly when carried over to 
the individual, although the same in absolute value, has rela- 
tively a much greater significance in the case of the individual. 

3. Special Testing 

After the third testing of the entire group was completed
in June, 1917, a special testing of certain boys was made to
compare their reactions under conditions of more detailed control.
In Section VI the results obtained from the special testing
are analyzed and compared with the results from the original
testings. All three Spelling tests were repeated with three
boys. The words were pronounced by the writer. Each word
was pronounced twice and a third time if necessary. The "long-short"
and "high-low" Opposites tests were repeated with four
boys. The time for each was forty seconds. Five boys were
given both Mixed Relations tests. Time: ninety seconds for
each. Both Easy Directions tests were repeated with four boys,
the time allowed being eighty seconds. These special tests were
given in the office at the Speyer School. Not more than three
boys were tested at any one time. Since in these cases a low
score had been made in one or more of the original tests it was 
suggested that probably this was caused by some disturbance 
or that the boy was not feeling well on the day of the test ; fur- 
ther, that probably he could do better and that an opportunity 
was then going to be given. The same explanation that was 
given at the original testing was made. 

TABLE VIII
The Tests and the Times at Which They Were Given

February, 1916                                        February, 1917                                        June, 1917

Woody Multiplication: Series A                        Woody Multiplication: Series B                        Hotz Algebra: Add. and Subt.
Woody Division: Series A                              Woody Division: Series B                              Hotz Algebra: Mult. and Div.
Trabue Completion: Scale B                            Trabue Completion: Scale J                            Trabue Completion: Scale L
Trabue Completion: Scale C                            Trabue Completion: Scale K                            Trabue Completion: Scale M
Reading Alpha 2: Part II                              Reading Alpha 2: Part II                              Reading: Selections from Beta and from S
Visual Vocabulary: Reading Scale A                    Visual Vocabulary: Selection from Scale A 2           Visual Vocabulary: Scale B, y series
Composition: How I Would Spend Twenty Dollars         Composition: What I Should Like to do Next Saturday   Composition: How I Should Like to Spend My Vacation
Spelling: 50 words from Ayres Q List                  Spelling: 50 words from Lists U to X                  Spelling: 50 words from Lists T to Z
Opposites: North-South                                Opposites: Long-Short                                 Opposites: High-Low
Mixed Relations: Good-Bad                             Mixed Relations: Eye-See                              Mixed Relations: Good-Bad
Easy Directions: Smallest dot                         Easy Directions: G in tiger                           Easy Directions: G in tiger
                                                      Hard Directions*                                      Hard Directions*

June, 1916

Trabue Completion:* Scale D
Trabue Completion:* Scale E

* These tests were used for a slightly different purpose from that of the eleven 
tests above. 



IV 
STATISTICAL TREATMENT 

1. Transmutation and Distribution of Scores

When all the papers had been scored the first step was to record 
the scores of the seventy-two boys in such manner that the score
of every boy in each test could be identified. Tables XXXIX 
to XLI in the Appendix contain these results. A distribution 
table of the original scores was then made for each of the thirty-seven
tests. The semi-interquartile-range (Q) of each of these
distributions was found by using the formula:

$$Q = \frac{Q_3 - Q_1}{2}$$

These Q's are given in Table IX. The reason for using the Q 
instead of the S.D., which was used in the preliminary investiga- 
tion, is discussed under topic two of this section. 
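
The semi-interquartile range is half the distance between the first and third quartile points of a distribution. A minimal sketch in Python follows. The function name and the sample scores are assumptions for illustration, and the quartile convention of `statistics.quantiles` may differ slightly from the hand calculation on grouped frequency tables used in the study.

```python
from statistics import quantiles

def semi_interquartile_range(scores):
    """Return Q, half the distance between the first and third quartiles."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return (q3 - q1) / 2

# Hypothetical scores, not taken from the study.
example_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(semi_interquartile_range(example_scores))  # 2.0
```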

In order that a part of the statistical work could be done 
before the last tests were given and scored in June 1917, the 
scores of the group of seventy-five pupils who took the eleven 
tests in February 1916 and February 1917 were used for the 
distributions and transmutations. In June 1917 three of these 
seventy-five pupils were not present when the tests were given. 
Their scores in the two previous testings were dropped from 
further consideration in this study. This produced practically 
no change in the Q's from what they would have been if only 
the seventy-two pupils' records had been used to find the Q's, 
especially since of the three records missing one was in the first 
tertile and two in the second tertile in February 1916, and one 
in each tertile in February 1917. 

Using the Q's shown in Table IX the intervals of each dis- 
tribution according to the original scores were transmuted into 
intervals according to their value in terms of the Q of the origi- 
nal distribution. The frequencies in these intervals, grouped 
in intervals of one Q, are shown in Table X. They are 
graphically represented by Figs. 3a, b, c, to 13a, b, c. 
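
The transmutation described here can be sketched as follows: each score is restated as a multiple of Q above or below the group median, and the results are counted in intervals one Q wide. The function names, the sample scores, and the choice of intervals running from k up to (but not including) k + 1 are assumptions; the study does not state exactly where its interval boundaries were placed.

```python
import math
from collections import Counter
from statistics import median, quantiles

def transmute(scores):
    """Express each score as a multiple of Q above (+) or below (-) the median."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    q = (q3 - q1) / 2
    group_median = median(scores)
    return [(score - group_median) / q for score in scores]

def frequencies_by_q_interval(deviations):
    """Count the deviations falling in each interval from k to k + 1 (in units of Q)."""
    return Counter(math.floor(d) for d in deviations)

# Hypothetical scores, not taken from the study.
deviations = transmute([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(deviations)
print(sorted(frequencies_by_q_interval(deviations).items()))
```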

TABLE IX

The Semi-Interquartile-Range (Q) of the Distribution of the
Original Scores for Each Test

Tests                       Feb. 1916   Feb. 1917   June 1917   June 1916

Woody Multiplication        2.13        1.21
Woody Division              2.62        .67
Hotz Alg. Add. and Subt.                            2.06
Hotz Alg. Mult. and Div.                            2.67
Trabue B, J, L, D           1.71        1.22        1.10        1.36
Trabue C, K, M, E           1.32        1.45        2.03        1.58
Reading Tests               4.26        4.59        3.13
Visual Vocabulary           3.77        13.88       4.27
Composition                 5.69        4.17        5.14
Spelling                    1.60        3.94        3.13
Opposites                   1.89        .47         1.54
Mixed Relations             4.58        2.95        2.93
Easy Directions             2.44        1.49        .76
Hard Directions                         2.46        1.57

TABLE X 

Distribution of Scores Transmuted Into Multiples of Q Above and
Below the Median of the Original Distribution in Each Test
2. The Use of Averages and Variabilities

Two methods of comparing the scores of an individual in all 
the tests were discussed in Section II. Under this topic of this 
section the reason for a slight variation of the method selected 
will be given, and excerpts from studies in which the method has 
previously been used will be quoted.

Because one purpose of this study is to discover the extremely
variant scores (those due to causes that prevent the individual
from making a normal or characteristic reaction, and also those
due possibly to unusual ability or the lack of ability), a method
that would tend to cover up these sport scores should not be
used, but rather a method which retains their variability in relative
proportions should be used. The standard deviation as a
measure of variability gives more weight to the extreme items
than to those nearer the central tendency. Therefore it would
seem that a measure of variability which avoids such weighting
should be chosen. The Q was chosen because with it the range
of the items beyond it does not affect the measure of variability,
the number of items beyond a given point being the influencing
factor.
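
The contrast drawn here can be illustrated with a small sketch in Python. The two score lists are hypothetical and differ only in their highest value; the helper name is an assumption. Moving one extreme score farther out leaves Q unchanged but enlarges the standard deviation.

```python
from statistics import pstdev, quantiles

def semi_interquartile_range(scores):
    """Return Q, half the distance between the first and third quartiles."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return (q3 - q1) / 2

# Hypothetical scores; the second list replaces the top score with an extreme one.
base = [10, 12, 14, 16, 18, 20, 22, 24, 26]
with_extreme = base[:-1] + [60]

print(semi_interquartile_range(base), semi_interquartile_range(with_extreme))
print(pstdev(base), pstdev(with_extreme))
```

Only the count of scores beyond the quartile points enters into Q, so the first line prints the same value twice, while the second line shows the S.D. growing with the extreme score.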

A minor reason for using the Q rather than the S.D. is found 
in the matter of mathematical inaccuracies introduced by the 
lopping off of fractions. The measures of the group are carried 
to two decimal places, the second being an approximation de- 
termined by the size of the third, and the measures of the indi- 
vidual are carried to one decimal place only with the same method 
of approximation. In cases of five-tenths or more the preceding
figure is increased by one; in cases of less than five-tenths
it is not changed. The S.D. being 1.4825 Q means that it
would carry with it a relatively greater mathematical approxi- 
mation in each case. 
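
The ratio between the S.D. and the Q quoted above holds for the normal distribution, and it can be checked with Python's standard library. This is a sketch for verification only; the variable names are assumptions.

```python
from statistics import NormalDist

# In a normal distribution the third quartile lies this many S.D. above the mean,
# which is therefore the value of Q expressed in S.D. units.
q_in_sd_units = NormalDist().inv_cdf(0.75)

# The reciprocal gives the S.D. expressed in units of Q.
sd_in_q_units = 1 / q_in_sd_units

print(round(q_in_sd_units, 4))  # 0.6745
print(round(sd_in_q_units, 4))  # 1.4826
```

The figure 1.4825 in the text agrees with this to three decimal places and corresponds to dividing 1 by the rounded value .6745.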

The following excerpts, one from a study by Naomi Nors- 
worthy and the other from a study by R. S. Woodworth, de- 
scribe and discuss the method more fully. 

We now have two series of grades in the same measurement, one set 
from mentally defective children and the other from ordinary school chil- 
dren. The usual method of comparing such results is to compare the 
records of one set of individuals with the central tendency of those of 
the others of the same age and sex. But in this case there were not enough 
defectives of any age to make the results gained from such treatment of 
any value, consequently a different method has to be adopted. The method
used in dealing with the majority of the measurements was one which
enabled me to compare the records of all of the defectives with those of all
the ordinary children without restriction as to age or sex. Another very
decided advantage is the fact that the units of grading are identical
throughout all the measurements, as will be evident from the following
description.

The difference between the record of each defective in any test and the 
median for an ordinary child of the same age and sex was found. This 
difference was then transmuted into positive or negative multiples of the 
probable error as the case required. ... By thus transmuting the dif- 
ference between the grading received by defectives and ordinary children 
respectively in every test into multiples of the probable error of the ap- 
propriate age and sex I can compare the records of the 150 defectives tested 
with the 500 or 600 ordinary children, just as if I had 150 idiots and 600 
school children all of the same age and sex. Not only by this method can
I consider all my cases together, but each test is, so far as is possible 
comparable with every other, irrespective of whether the trait examined 
is physical or mental. This, so far as I know, has not yet been done. . . . 
This method, then, provides a measure by which we can tell not only how 
far the idiots are below school children in the various traits tested, but 
how much farther below they are in one mental trait than in another and 
whether they are equally deficient in physical and mental traits.¹

What is needed is a method of combining results which shall preserve all 
the refinement of the original measurements. Such a method exists, and 
is certainly familiar to statisticians; but it seems to be overlooked in many 
cases where it would prove of value.

Here follows a discussion of averages and variabilities. 

There is a way of eliminating both of the troublesome quantities: both
the absolute value of the average and the absolute measure of variability.
Let the average in each case be counted as 0, i.e., let the individual's
standing be expressed as a deviation above or below the average; and
further, let the measure of variability be taken as the unit deviation, and

1 Norsworthy, Naomi, The Psychology of Mentally Deficient Children,
Columbia University Contributions to Philosophy and Psychology, Vol. XV, 
No. 2, p. 60. 
all deviations be expressed as fractions or multiples of this unit. (For the
measure of variability, either the average deviation, or the mean square 
deviation, or the quartile, etc., may be chosen.) What this method does 
is to assign each individual's position in the distribution of the group: he 
stands, namely, above or below the group average, and so and so much 
above or below as compared with the average variation of the group. 

No assumption is made by this method as to the ratio between the 
variability and the group average; for the average is taken as 0 and the
variability as 1, independently the one of the other. The only assumptions 
underlying the method are those involved in every use of averages and 
variabilities, namely, that the average means the same thing in respect to
one distribution as in respect to another, and, likewise, that the measure 
of variability means the same thing in respect to the different distributions. 
Both of the assumptions are correct if the distributions are of the "normal" 
type, or if all the distributions belong to any one type. Were one distribu- 
tion normal, another markedly skew, and a third distinctly bimodal, 
neither the average nor the average deviation would mean quite the same 
thing in respect to the three, and the method would be illegitimate; but in 
such a case it is doubtful if the distributions ought properly to be com- 
bined at all. Mental tests usually give group distributions not very dif- 
ferent from the "normal" though tending on the whole to be somewhat skew 
in such a way that more individuals lie on the good side than on the bad 
side. The distributions for different tests do not differ much in shape, 
and no considerable error can be introduced by placing the average always 
equal to 0 and the average deviation (or mean square deviation, etc.)
always equal to 1.²

Although the method is fully described by the two quotations 
just given, a concrete illustration from this investigation may 
serve as an aid in projecting the method to this study. In Fig. 
2 are represented graphically the records of the same three in- 
dividuals whose scores according to the classification by rank 
are shown in Fig. 1. The reduction of the number of subjects 
from ninety-seven in the preliminary investigation to seventy- 
two in the study proper changed the serial numbers so that Case 
60 here is Individual 45 ; Case 65, Ind. 49 ; and Case 92, Ind. 71. 
The reduction of the number in the group makes it unfair to 
compare directly the range and placement of scores in values 
of Q with their range and placement by rank. Further com- 
parison in this respect will be made in Section VI. Fig. 2 is 
introduced here to illustrate the method. 

To compare these individuals in their standing in the same
test, in terms of achievement in that test, their scores should
be read by the scale at the right of Fig. 2. In spelling (S)
for example, Ind. 45 is .1 Q above the median of the group (Med.
gr.), likewise, Ind. 49 is .1 Q above the median of the group.
They both received the same original score in spelling which

2 Woodworth, R. S., "Combining the Results of Several Tests: A Study
in Statistical Method," The Psychological Review, Vol. XIX, pp. 97-101.
Fig. 2. Distribution of the Scores, in Values of Q, of Three Individuals.
February 1916 Tests. The letters signify tests as follows:

X - Woody Multiplication      T - Composition
W - Woody Division            S - Spelling
B - Trabue B                  O - Opposites
C - Trabue C                  R - Mixed Relations
A - Reading Alpha 2           I - Easy Directions
V - Visual Vocabulary

was 48. Ind. 71 is 1.2Q below the median of the group in spell- 
ing. His score was 46. The achievement in this spelling test 
is then the same for Individuals 45 and 49, and 1.3Q less for 
Ind. 71. The median achievement in one test is rated equal to 
the median achievement in another, and likewise, the Q deviation 
in one test is equated with that of any other. 
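
The arithmetic of the spelling example can be retraced in a short sketch. The Q of 1.60 is the value given in Table IX for Spelling, February 1916, and the scores 48 and 46 are those quoted above. The group median is not stated in this passage, so the value 47.9 used below is an assumption chosen only because it reproduces the reported figures; the function name is likewise hypothetical. Rounding follows the rule given earlier, five or more in the next place raising the preceding figure.

```python
from decimal import Decimal, ROUND_HALF_UP

Q_SPELLING_FEB_1916 = Decimal("1.60")    # from Table IX
ASSUMED_GROUP_MEDIAN = Decimal("47.9")   # assumption, not given in the text

def to_q_units(score, group_median=ASSUMED_GROUP_MEDIAN, q=Q_SPELLING_FEB_1916):
    """Express a score as a multiple of Q from the group median, to one decimal."""
    deviation = (Decimal(score) - group_median) / q
    return deviation.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

ind_45 = to_q_units(48)   # same original score as Ind. 49
ind_71 = to_q_units(46)
print(ind_45, ind_71, ind_45 - ind_71)  # 0.1 -1.2 1.3
```

Before rounding, the two-point difference in original scores amounts to 2 / 1.60 = 1.25 Q; the 1.3 Q reported in the text is the difference between the two rounded values.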

This explanation holds also in comparing the individual's 
achievement in different tests and the achievement of different 
individuals in different tests. For example, in relation to the 
median achievement of the group in all the tests Ind. 45 achieved 
the same in Trabue B and Mixed Relations as did Ind. 71 in
Woody Division. All three of these scores are 1.0 Q above the
median of the group. The achievement of one individual in
different tests is simply stated in multiples of Q above and be- 
low the median of the group, and therefore, compared with the 
median of the group in all the tests by reading from the scale 
at the right of the figure. 

To compare these individuals in their standing in the same 
test, in terms of their own median achievement, their scores 
should be read by the scale at the left of Fig. 2. The median 
line of this scale begins with the median score of the individual 
ranking highest as judged by median attainment, and connects 
the median scores of the three individuals. Using spelling again 
the figure shows that in comparison with their other scores Inds. 
49 and 71 are alike in achievement in spelling (spelling represents
their median achievement), while the spelling achievement
of Ind. 45 is 1.0 Q below his median achievement. This shows 
that the original scores cannot be taken at face value in diagnos- 
ing individual cases. This answers the question in Section I 
concerning the meaning that the same scores in a single test 
may have in connection with the achievement of different indi- 
viduals. Two scores of the same test having the same value 
by the original scoring may have very different meaning in the 
two distributions of the individuals' scores; one may be compara- 
tively high among the achievements of one individual, while the 
other may be comparatively low. Any two scores can be related 
to the median achievement of the respective individuals by re- 
ferring them, following the line marked out by the median of the 
individuals, to the scale at the left of the figure. 
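
Reading a score against the individual's own median amounts to subtracting that median from each of his scores in Q units. The sketch below illustrates this with a hypothetical seven-test profile. The values for S, B, and R echo those quoted in the text for Ind. 45; the remaining values are invented so that the profile's median falls at 1.1 Q, and the function name is an assumption.

```python
from statistics import median

def relative_to_own_median(profile):
    """Restate each score (in Q from the group median) as Q from the individual's own median."""
    own_median = median(profile.values())
    return {test: round(value - own_median, 1) for test, value in profile.items()}

# Hypothetical profile in multiples of Q above the group median.
profile = {"X": 1.1, "W": 1.5, "B": 1.0, "C": 1.3, "A": 1.2, "R": 1.0, "S": 0.1}

relative = relative_to_own_median(profile)
print(relative["S"])  # -1.0
```

A spelling score slightly above the group median thus stands a full Q below this individual's own median, which is the situation the text describes for Ind. 45.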

In the excerpt above Woodworth points out that the validity 
of not only this but of any method of comparing or equating 
scores depends to a large extent upon the similarity among the 
forms of distribution of the measures. Figs. 3a, b, c to 13a, b, c 
show graphically the distributions of scores in values of Q in 
thirty-three of the tests. The distributions show no great dissimilarity
except in the cases of 11b, Opposites, February 1917;
13b, Easy Directions, February 1917 ; and 13c, Easy Directions, 
June 1917, which are rather markedly skewed. The other 
distributions show skewness upward and downward to varying 
degrees, but are similar enough in general to be compared with 
but little loss in accuracy from this cause. 
The effect of the skewness of the original distributions mentioned
above upon the Q is illustrated by Fig. 14. The piling
up of scores reduces the extent of the semi-interquartile-range 
measured by intervals of the original distribution of the scores 
of the group. The reduction is more pronounced in the range 
above the median than in the range below the median. In this 
form of distribution the Q is 87.4 per cent of the Q of the nor- 
[Figs. 3a, b, c to 13a, b, c. Distribution of Scores in Each Test Transmuted
into Multiples of Q Above and Below the Median of the Original
Distribution. The panels are as follows.]

Fig.   Feb. 1916 (a)      Feb. 1917 (b)      June 1917 (c)

3      Woody Mult.        Woody Mult.        Alg., Add. Subt.
4      Woody Div.         Woody Div.         Alg., Mult. Div.
5      Trabue B           Trabue J           Trabue L
6      Trabue C           Trabue K           Trabue M
7      Read. Alpha 2      Read. Alpha 2      Reading
8      Visual Vocab.      Visual Vocab.      Visual Vocab.
9      Composition        Composition        Composition
10     Spelling           Spelling           Spelling
11     Opposites          Opposites          Opposites
12     Mixed Relat.       Mixed Relat.       Mixed Relat.
13     Easy Direct.       Easy Direct.       Easy Direct.

mal distribution.³ The reduction in the value of the Q increases
the minus value of the low scores expressed in multiples of Q. 
It places them, when compared with scores of tests having a 
normal distribution, at a greater distance from the median than 
they would be if the test had been of such nature that the higher 
scores had been spread out approximating more closely the form 
of the normal distribution. Moreover, the high scores which 
would have been higher if the test had permitted are covered 
up by this form of distribution. In Fig. 14 a score 3Q below the 

3 Thorndike, E. L., Mental and Social Measurements, p. 73. Surface of
Frequency of Form G. (The Q is derived from the values of a and the 
surfaces A and C are smoothed.) 
Fig. 14. Showing the Effect of Two Different Forms of Distribution
upon the Q. 

median of the normal distribution would be 3.43 Q below the 
median in the skewed distribution. 
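
The figure of 3.43 Q follows from the ratio of the two Q's: when the Q of the skewed distribution is 87.4 per cent of the Q of the normal distribution, the same distance from the median counts for more units of the smaller Q. A brief check in Python, with an assumed function name:

```python
Q_RATIO = 0.874  # Q of the skewed form as a fraction of the Q of the normal form

def normal_q_to_skewed_q(deviation_in_normal_q, q_ratio=Q_RATIO):
    """Re-express a deviation measured in the normal Q in units of the smaller, skewed Q."""
    return deviation_in_normal_q / q_ratio

print(round(normal_q_to_skewed_q(3), 2))  # 3.43
```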



3. Redistribution of Scores

After the intervals of the original distributions were trans- 
muted into multiples of Q the next step was to transmute each 
score into a multiple of the Q of its distribution. A table containing
such transmutations of the 2664* scores grouped vertically
for the tests and horizontally for the individuals was
made. From a part of this table (that part showing the transmuted
scores of eleven of the June 1917 tests) the chart shown
in Fig. 15 was constructed.
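
The count of 2664 is the product of seventy-two boys and thirty-seven tests, and the footnote to that number states that a lacking score was replaced by the median score of the group. A minimal sketch of both points follows. The function name and the sample column are assumptions, with `None` standing for a lacking score.

```python
from statistics import median

N_BOYS = 72
N_TESTS = 37

def fill_missing_with_median(column):
    """Replace lacking scores (None) in one test's column with the median of the scores present."""
    present = [score for score in column if score is not None]
    group_median = median(present)
    return [group_median if score is None else score for score in column]

print(N_BOYS * N_TESTS)  # 2664

# Hypothetical column of scores for one test, with one score lacking.
print(fill_missing_with_median([46, None, 48, 50]))  # [46, 48, 48, 50]
```

A score supplied in this way falls exactly at the median of the group, so that after transmutation it stands at zero Q.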

As already suggested under Topic 2, achievement by any two 
scores of one test or different tests can be compared directly 
by this method. The chart of Fig. 15 facilitates such compari- 
son of the scores of the June 1917 tests. The scales for the 
graph are the same as those of Fig. 2: the one at the right gives
values of Q above and below the median of the group, and the
one at the left, values above and below the individual medians.

For use in the calculations made, charts similar to this were 

*This would be the number if every boy had taken every test. Scores 
are lacking for pupil No. 40 in Composition and Spelling, June, 1917, and 
for pupil No. 27 in Woody Division, February, 1917. In these cases the 
median score of the group was supplied. Four scores are lacking in Trabue 
D and E each. These were not supplied because these two tests were not
used in the group of eleven. 
constructed for each of the other two testings using the eleven
tests, the eight Trabue tests combined, the six mathematics tests 
combined, the five directions tests combined, and the three read- 
ing tests combined. In these charts each score of each individual 
is identified by a letter so that any two of them can be found 
and compared in respect to achievement in values from the 
median of the group or in relation to the other achievements 
of the individual. 

Up to this point the experimental material used in this study 
and the method of treating this material statistically have been 
considered. The subjects and the tests have been described. 
The subjects represent very closely typical seventh grade ability. 
The tests used were not devised for this special study but are 
tests which have been carefully standardized and used exten- 
sively in other investigations. A slight variation from the 
method of equating scores decided upon in the preliminary in- 
vestigation has been discussed. The Q rather than the S.D. is 
used as the measure of variability because the extreme scores, 
those which probably do not represent the individual's normal 
reaction, have no greater effect upon the Q than do other scores; 
while they do have a greater effect upon the S.D. than other 
scores nearer the central tendency have. It has been shown that 
in a given test two scores having the same value do not necessarily 
have the same meaning for the two individuals; but that the 
meaning of each score must be interpreted by comparison with 
the other achievements of the individual. For example, one 
of two scores equal in value may be very low for one individual 
in comparison with his other achievements while the same score 
for another individual may be equal to or above his median 
achievement. The next section will deal with the results found 
in connection with individual variability as compared with group 
variability. 
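
The reason given above for preferring the Q to the S.D. can be illustrated with a small sketch. The scores are hypothetical, and the "inclusive" quartile rule is an assumption made only for the example; the point is simply that replacing one score by an extreme value leaves the Q unchanged while the S.D. grows.

```python
from statistics import pstdev, quantiles


def semi_interquartile_range(scores):
    """Q: half the distance between the first and third quartile points."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return (q3 - q1) / 2


# Hypothetical scores; the second list differs only in its one extreme score.
typical = [10, 12, 14, 16, 18, 20, 22, 24, 26]
with_extreme = [10, 12, 14, 16, 18, 20, 22, 24, 80]

q_typical = semi_interquartile_range(typical)
q_extreme = semi_interquartile_range(with_extreme)
sd_typical = pstdev(typical)
sd_extreme = pstdev(with_extreme)

print("Q:   ", q_typical, q_extreme)
print("S.D.:", round(sd_typical, 2), round(sd_extreme, 2))
```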



INDIVIDUAL VARIABILITY COMPARED WITH GROUP 

VARIABILITY 

1. The Amount of Individual Variability 

What is the amount of the individual's variability among the 
different tests? Stated more specifically do the scores of some 
individuals tend to be high, do the scores of others tend to be 
near the average of the group, and do the scores of still others 
tend to be low? Or are the scores of most individuals so spread 
out that there is no well defined mode? The answers to these 
questions involve other questions, namely: By what standard 
shall the individual's variability be measured, and by what 
method can the measurement be made? 

The unit of measurement already described will be used. 
The Q of the group will be taken as the unit or standard. The 
median achievement will be taken as the starting point and 
variability will be measured in terms of the amount of the devia- 
tion in either one or both directions from the median. What 
then is the amount of variability of the group? It is the stan- 
dard or unit, one Q. Having now related the standard to the 
problem under consideration the question can be asked in more 
specific terms, namely: What per cent of the variability of the 
group in each test is the variability of the individual in all the 
tests, that is, what per cent of the Q taken as the standard is the 
variability of the individual? 

With the scores transmuted into values of Q and redistributed 
in charts such as that shown in Fig. 15, the range between the 
two extreme scores, the range between the median score and the 
last score above the median, the range between the median score 
and the last score below the median, and the range between the 
third score above and the third score below the median were 
found for each individual. Using a scale these were read di- 
rectly from the charts. The averages of these ranges for the 
group and for different divisions of the group are shown in Table 
XIII. The last of the ranges enumerated above, the range be- 
tween the third score above and the third score below the median, 
is an approximation of the interquartile range. It covers a 
distance of 3 intervals on each side of the median whereas the 
interquartile range covers a distance of only 2.75 intervals on 
each side. It was used as a matter of economy in calculation. 
This distance was readily determined whereas the distance of 
2.75 intervals would have necessitated interpolation for the value 
in every case. The following correction was made for the ap- 
proximation of the interquartile range so that the results would 
be comparable with the Q of the original distributions. 
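
The four ranges read from the charts for each individual can be sketched as below. This is an illustration under stated assumptions: the eleven transmuted scores are hypothetical, the function name is invented, and the sixth of the eleven ordered scores is taken as the individual median.

```python
def individual_ranges(scores_in_q):
    """Ranges for one pupil's eleven scores, each already in multiples of Q."""
    s = sorted(scores_in_q)
    if len(s) != 11:
        raise ValueError("this sketch assumes exactly eleven scores")
    mid = 5  # index of the median (sixth) score
    return {
        "total": s[-1] - s[0],                  # between the two extreme scores
        "above": s[-1] - s[mid],                # median to last score above
        "below": s[mid] - s[0],                 # median to last score below
        "approx_iqr": s[mid + 3] - s[mid - 3],  # third above to third below
    }


# Hypothetical transmuted scores of one pupil (not data from the study).
pupil = [-2.0, -1.5, -1.0, -0.5, -0.25, 0.0, 0.25, 0.5, 1.0, 1.5, 2.5]
print(individual_ranges(pupil))
```

The last entry is the approximation of the interquartile range described above; it takes the third score on each side of the median and so needs no interpolation.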

By the method used the approximation of the interquartile 
range covers a distance of 6 intervals, 3 intervals on each side 
of the median. The interquartile range covers a distance of 5.5 
intervals, 2.75 intervals on each side of the median. The extent, 
in terms of the Q of the original distributions, of 3/5.5 of the 
measures on each side of the median was found. The extent 
of 2.75/5.5 of the measures on each side of the median is desired. 
3/5.5 of 50 per cent = .2727. Using a table¹ of values of x/Q 
of the normal probability integral it is found that 27.27 per cent 
of the surface in each direction from the median corresponds to 
a distance of approximately 1.11 Q on the base line. Hence the 
values found are 111 per cent of the values desired. Dividing 
the values given in Table XIII (E) by 2 and making this cor- 
rection we have the values of the semi-interquartile range or Q 
which are given in Table XI. 
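
The correction can be checked numerically. The sketch below replaces the printed table of the probability integral with Python's `statistics.NormalDist`, which is a substitution made for the example; the figure 1.82 is the average for the total group in Table XIII (E).

```python
from statistics import NormalDist

std_normal = NormalDist()

# Q of a normal distribution, in units of its standard deviation.
q_in_sigma = std_normal.inv_cdf(0.75)

# 3/5.5 of the 50 per cent of measures lying on one side of the median.
area_each_side = (3 / 5.5) * 0.50

# Distance from the median that encloses that share of the surface.
x_in_sigma = std_normal.inv_cdf(0.50 + area_each_side)

# The same distance expressed in multiples of Q.
correction = x_in_sigma / q_in_sigma

approx_iqr_total = 1.82  # Table XIII (E), average, total group
corrected_q = approx_iqr_total / 2 / correction

print(round(area_each_side, 4), round(correction, 2), round(corrected_q, 2))
```

Under these assumptions the ratio comes to about 1.11 Q, and the corrected value for the total group to about .82, in agreement with the figures in the text and in Table XI.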

TABLE XI 

Average of the Individual Semi-Interquartile-Ranges in the 
Eleven Tests 

The table reads as follows: In February, 1916, the average of the 
Q's of the first quartile was 80 per cent of the Q taken as the standard, etc. 

                        Quartile                   Tertile                Corrected
                   I     II    III    IV       I     II    III    Total    Total
February 1916     .80   .90   .91    .99      .80   .93   .96     .90      .81
February 1917     .71   .86   .87   1.17      .76   .82  1.14     .90      .81
June 1917         .79   .76   .94   1.29      .76   .88  1.19     .94      .85
Average           .77   .84   .91   1.16      .77   .87  1.10     .91      .82
Corrected
  Average         .69   .76   .82   1.04      .69   .78   .99     .82


¹ Thorndike, E. L., Mental and Social Measurements, p. 220. 

Inspection of this table gives an answer to the question raised 
above, namely: What is the amount of the individual's varia- 
bility among the different tests? Measured in terms of Q the 
average individual variability is 82 per cent of the variability 
of the group. That is, the average semi-interquartile-range of 
the individual is 82 per cent of the average semi-interquartile- 
range of the group. The average range in ranks of the ninety- 
seven pupils studied in the preliminary investigation and given 
in Section II was found to be 82 per cent of the total range pos- 
sible. These two figures supplementing each other as they do, 
are convincing evidence of the large amount of variability among 
the achievements of these pupils in the different tests. This 
variability is evidence of the unreliability of one test or a small 
number of tests used for the purpose of educational prognosis. 

The table shows further that the individual variability is 
greater in the third testing than in either of the first two, but 
not enough greater to be of significance. The difference in vari- 
ability among the different divisions of the group as ranked by 
median achievement is consistent enough and large enough to 
be significant. When grouped either in quartiles or tertiles the 
lower ranking pupils are found to be more variable in their 
achievements. The corrected averages show that the variability 
of the fourth quartile is 50 per cent greater than the variability 
of the first ; and that the variability of the third tertile is 43 per 
cent greater than that of the first. The fourth quartile exceeds 
the standard adopted and the third tertile almost equals it. 

TABLE XII 

Distribution of the Individual Semi-Interquartile-Ranges 
(Approximation) in the Eleven Tests 

Value in Q       Feb. 1916    Feb. 1917    June 1917
2.0 to 2.4                                      2
1.5 to 1.9           3            5             4
1.0 to 1.4          22           20            25
 .5 to  .9          43           41            34
 .0 to  .4           4            6             7


Table XII gives the distribution of the individual semi-inter- 
quartile-ranges (approximation) for the three testings. It 
shows a slight increase in individual variability the longer the 
pupils remain in school. It means that there is a slightly greater 
range in the achievements of the individual pupils in the dif- 
ferent tests. However, this difference is not great enough to 
base any conclusions upon it. 

The amount of individual variability can be further meas- 
ured by finding the total range, the range above the median, and 
the range below the median for each individual in his different 
achievements. The results are given in Table XIII. 



TABLE XIII 

Averages in Connection With Individual Ranges in Scores 
Transmuted Into Multiples of Q 

The table reads as follows: In February, 1916, the average total 
range of the pupils in the quartile ranking highest was 4.14 Q, etc. 



(A) Average Total Range in the Eleven Tests 

                         Quartile                     Tertile
                   I      II     III    IV        I      II     III     Total
February 1916     4.14   4.14   6.39   5.04      4.03   5.15   4.85    4.68
February 1917     4.71   4.69   4.43   5.68      4.65   4.45   5.52    4.88
June 1917         4.21   3.97   4.59   6.42      4.05   4.55   5.79    4.87
Average           4.35   4.27   4.80   5.71      4.24   4.72   5.39    4.78

(B) Average Range Above Individual Medians in the Eleven Tests 

                         Quartile                     Tertile
                   I      II     III    IV        I      II     III     Total
February 1916     1.70   2.07   2.00   2.14      1.72   2.16   2.05    1.98
February 1917     1.48   1.78   1.75   2.18      1.49   1.89   2.02    1.80
June 1917         1.73   1.79   1.98   2.65      1.73   1.83   2.55    2.04
Average           1.64   1.88   1.91   2.32      1.65   1.96   2.21    1.94



(C) Average Range Below Individual Medians in the Eleven Tests 

                         Quartile                     Tertile
                   I      II     III    IV        I      II     III     Total
February 1916     2.44   2.07   3.39   2.90      2.31   2.99   2.80    2.70
February 1917     3.22   2.92   2.68   3.49      3.16   2.56   3.50    3.08
June 1917         2.47   2.18   2.61   3.77      2.32   2.72   3.23    2.74
Average           2.71   2.39   2.89   3.39      2.60   2.76   3.18    2.84

(D) Average Total Range in Certain Tests Combined 

                         Quartile                     Tertile
                   I      II     III    IV        I      II     III     Total
Eight Trabue      3.57   3.57   3.66   3.18      3.69   3.40   3.33    3.47
Six Mathematics   2.84   3.43   2.86   3.81      2.70   3.38   3.62    3.23
Five Directions   1.91   2.33   2.66   2.91      2.11   2.20   3.05    2.45
Three Reading     1.59   1.74   1.62   2.45      1.73   1.57   2.25    1.86
Average           2.48   2.77   2.68   3.09      2.56   2.64   3.06    2.76




(E) Average Interquartile Range (Approximation) in the Eleven 
Tests 

                         Quartile                     Tertile
                   I      II     III    IV        I      II     III     Total
February 1916     1.69   1.79   1.81   1.97      1.60   1.85   1.92    1.79
February 1917     1.42   1.71   1.73   2.34      1.60   1.63   2.27    1.80
June 1917         1.57   1.50   1.88   2.57      1.52   1.75   2.37    1.88
Average           1.63   1.67   1.81   2.29      1.54   1.74   2.19    1.82



Although the range is not so reliable as other measures of 
variability, still some deductions can be drawn from Table XIII 
which are significant. In all the parts of this table the fourth 
quartile shows consistently a marked increase in variability 
over the first, and likewise, the third tertile over the first, except 
in the case of the eight Trabue tests. This shows that among 
different abilities and in the same ability the range of achieve- 
ments of the low ranking pupils is greater than that of those 
ranking high. Is this because of the poor showing, oftentimes 
almost absolute failure, they make in some tests? Parts B and 
C of Table XIII bear specifically upon this question. The pupils 
ranking low consistently have a greater range above their med- 
ian achievement than do the pupils ranking high. Table XXIII 
shows that for all three testings there were twenty very low 
scores, almost absolute failures, in the highest tertile, and thirty- 
six very low scores in the lowest tertile. The difference between 
these two numbers is not sufficient to account for the greater 
range in ability on the part of the duller pupils. The results 
tend to show that the greater variability of the low ranking 
pupils is due to some factor inherent in the nature of their work. 

Parts A, B, and C of Table XIII should be compared with 
the first three columns of Table IV, which show the greatest 
total range of achievements in the second and third quartiles, 
and also a smaller range below the median in the fourth quar- 
tile than below the median in the first quartile. The data of 
Table XIII, as has already been pointed out, are more reliable 
than those of Table IV. The results shown in Table XIII will 
be discussed further in connection with the results of Tables 
XIV, XVIII, and XIX. 

TABLE XIV 

Comparison of the Variability of Individual Ranges in the Eleven 
Tests by Quartiles and by Tertiles 

The per cents given are computed from the average of the average ranges 
in the three testings, * except the average of the Eight Trabue, Six 
Mathematics, Five Directions, and Three Reading Tests. 

The table reads as follows: In the eleven tests the total range of 
Quartile II is 98 per cent of Quartile I, etc. 

Per Cent Which the Variability of Each Quartile and Each 
Tertile is of Those Higher. (I is considered highest) 

                                     Quartile                            Tertile
                       1      2      3      4      5      6        7      8      9
According to          II     III    IV     III    IV     IV       II     III    III
Values of Q          of I   of I   of I   of II  of II  of III   of I   of I   of II

Total Range           .98   1.10   1.31   1.12   1.34   1.19     1.11   1.27   1.14
Range Above Median   1.15   1.17   1.42   1.02   1.23   1.21     1.19   1.34   1.13
Range Below Median    .88   1.07   1.25   1.21   1.42   1.17     1.06   1.22   1.15
* Average Range of
  Four Groups        1.12   1.08   1.26    .97   1.12   1.16     1.03   1.20   1.16
Inter-Quartile Range
  (Approximation)    1.09   1.18   1.50   1.08   1.37   1.27     1.13   1.42   1.26


The data of Table XIV are computed from the averages of 
Table XIII. This table summarizes the evidence on the ques- 
tion as to whether the duller pupils or the brighter pupils have 
the greater range in their achievements. Column 3 shows that 
on an average the lowest quartile had a range above the individ- 
ual medians 42 per cent greater than the first quartile. Fur- 
ther, the range of the fourth quartile below the individual med- 
ians was only 25 per cent greater than that of quartile one. 
However, these two figures, 42 and 25, should not be compared 
at face value. By the range of the tests used and the placement 
of median ability above the median of the range of the tests 
the possibility of large ranges above their medians was limited 
for the pupils ranking high, while in all other cases the range 
of the tests was sufficient to allow for the maximum individual 
range in either direction from the individual median. This 
would tend to make the 42 per cent increase in the range of the 
fourth quartile over the first somewhat greater than it should be. 
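
The way the entries of Table XIV follow from the averages of Table XIII can be sketched for the total range. The four averages are those printed in Table XIII (A); the variable names are chosen only for this example.

```python
from itertools import combinations

# Averages of the three testings, Table XIII (A): total range by quartile.
avg_total_range = {"I": 4.35, "II": 4.27, "III": 4.80, "IV": 5.71}

order = ["I", "II", "III", "IV"]  # I is considered highest

# Each lower quartile expressed as a proportion of each higher quartile.
ratios = {
    (lower, higher): avg_total_range[lower] / avg_total_range[higher]
    for higher, lower in combinations(order, 2)
}

for (lower, higher), value in ratios.items():
    print(f"{lower} of {higher}: {value:.2f}")
```

The six proportions obtained in this way agree with the first six entries of the "Total Range" row of Table XIV.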

The number of pupils making the highest score possible in 
the different tests shows that the range of ability covered was 
not an important factor in limiting the variability of the pupils 
ranking highest. In 21 of the 33 tests the highest score possible 
was reached by none of the pupils ; in 9 it was made by a rela- 
tively small number ; and in only 3 tests was the highest score 
possible made by a relatively large number of the pupils. That 
the range of ability covered by the tests was not an important 
factor in limiting the variability of the highest ranking pupils 
is shown further by certain results in Table XIV. The per- 
centage of increase of the lower quartiles over the higher quar- 
tiles* is as great in the case of the approximation of the inter- 
quartile range as in the total range. If the range of ability of 
the tests had been operative to any great extent it should have 
affected the total range of variability to a noticeably greater ex- 
tent than the interquartile range. 

Another point should be mentioned in this connection. In 
the range below the median there is undoubtedly a factor which 
is not present in the range above. Low scores in these tests 
are sometimes caused by external conditions, — chance occurrences 
such as the dropping of a pencil, becoming amused at some part 
of a rate test, etc., while high scores are not so caused. High 
scores are the result of ability ; low scores are the result of either 
less ability or the failure of ability to function due to various 
causes. Thus in both cases the lower range is increased by this 
second factor which tends to make the percentage of increase 
in variability lower. 

Allowing for these corrections the results seem to show that the 
low ranking pupils of this group are inherently more variable in 
their achievement than the pupils ranking high. As to the reason 
for the one exception suggested above, namely, the eight Trabue 
tests combined in which the low ranking pupils are least variable, 
the investigation offers no evidence. It may be that the tests 
are better standardized, or that the ability required for these 
tests is more specific, or there may be some other reason for the re- 
sults. It should be observed, however, that two problems are in- 
volved in this connection. One is the variability of individuals 
among different abilities and the other is the variability of indi- 
viduals in different testings of the same ability. A large amount 
of variability among several traits or abilities does not neces- 
sarily imply great variability among several tests of the same 
trait. The Trabue scales test a single trait while the eleven 
tests cover several traits. Ability in each trait may remain about 
the same relatively from one testing to another and still there 
may be great variability among the several tests. 

The amount of variability among the different tests cannot 
be compared directly with the amount among combined similar 
tests, shown in Section D of Table XIII, because in each case 
the number of tests combined is different from the number of 
different tests² in Section A of the table. This could be ac- 
complished by some method of weighting but such will not be 
attempted here. 

2. Distribution of Individual Variability 

Several measures of the amount of individual variability have 
been found by taking different single measures of the range 
in achievement. The distribution of all the scores above and 
below the individual medians will throw more light upon this 
problem. Such distributions for the average of the three test- 
ings by tertiles and for the entire group by each testing are 
given in Table XV. The frequencies here are expressed in per 
cents so that the distributions for the eleven different tests may 
be compared later with the distributions for the combined sim- 
ilar tests. Figs. 16a to 17c show graphically the data in this 
table. 
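
A distribution of this kind can be sketched as follows. The scores are hypothetical, and the grouping rule is an assumption: deviations from the individual median are placed in steps of .5 Q by their lower boundary, which differs slightly from the printed tables, where the two steps next to the median are headed ".0 to +.4" and ".0 to -.4".

```python
import math
from collections import Counter
from statistics import median


def percent_distribution(scores_in_q, width=0.5):
    """Per cent of scores in each step of `width` Q about the individual median."""
    m = median(scores_in_q)
    steps = Counter(math.floor((s - m) / width) * width for s in scores_in_q)
    n = len(scores_in_q)
    return {lower: 100 * count / n for lower, count in sorted(steps.items())}


# Hypothetical transmuted scores of one pupil (not data from the study).
pupil = [-1.2, -0.3, 0.0, 0.2, 0.9]
dist = percent_distribution(pupil)
print(dist)
```

Because the frequencies are expressed in per cents, distributions built from different numbers of tests cover equal surfaces and their general shapes can be set side by side.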

² These obviously are not all different tests in the sense of testing strictly 
different abilities. The two Trabue tests are of course for the same ability 
and the mathematics tests are for rather closely related abilities. The use 
of the phrase "eleven different tests" will be continued in the study with 
this limitation understood. 

TABLE XV 

Distribution (in Per Cents) of Scores Above and Below the Individual 
Medians in the Eleven Tests Transmuted Into Multiples 
of Q by the Original Distributions 

The Measure of Central Tendency is the Median of the Individual's 
Scores Transmuted into Multiples of Q. 

Figs. 16a, 16b, and 16c represent the distributions of the 
scores of Tertiles I, II, and III respectively. From the form 
of these curves it is evident that in the total distribution of their 
scores the pupils ranking lowest, those of the third tertile, are 
most variable in their achievements. The mode of the third ter- 
tile is not so pronounced as that of the first tertile. The range 
above the median of the third is greater than that of the first, 
and the range below the median of the third shows more ex- 
treme cases. All three curves tend to bring out the difference 
between the distribution of scores above the median and the 
distribution below. The range below is greater and more reg- 
ular in its decline. The two halves of the curves show one sim- 
ilarity which is spurious. All the scores of 5Q or more are 
grouped into the last frequency because of the extreme range 
in the graph which a few of the scores would have necessitated, 
18.7Q in one case. This suggests a relation between the high 
and low scores of extreme variability which does not exist. 

Figs. 16a to 17c. Distribution (in Per Cents) of Scores Above and Below 
the Individual Medians in the Eleven Tests Transmuted into Multiples of 
Q by the Original Distributions. 

Figs. 16a, b, c. Averages of the Three Testings by Tertiles 
Fig. 16a. Tertile I 
Fig. 16b. Tertile II 
Fig. 16c. Tertile III 

Figs. 17a, b, c. Entire Number by Testings 
Fig. 17a. Feb. 1916 
Fig. 17b. Feb. 1917 
Fig. 17c. June 1917 

Figs. 17a, 17b, and 17c represent the distribution of scores 
above and below the individual medians in February, 1916, 
February, 1917, and June, 1917 respectively. Their significance 
is in their similarity. There is only one point of difference to 
note. It is the greater length of the curve in Fig. 17c when it 
nears the base line. The increase is not enough to be especially 
significant, and, moreover, it is in that part of the curve which is 
least reliable. However, it is in accord with the slight increase 
which was found in the Q of the third testing, and tends to 
show that these pupils became more variable in their own achieve- 
ments the longer they remained in school. 

TABLE XVI 

Distribution (in Per Cents) of Scores Above and Below the Individual 
Medians in Certain Tests Transmuted Into Multiples 
of Q by the Original Distributions 

The Measure of Central Tendency is the Median of the Individual's Scores 
Transmuted into Multiples of Q. 

                   Eight     Six Mathe-     Five        Three
Value in Q         Trabue    matics         Directions  Reading
+5.0 to
+4.5 to +4.9
+4.0 to +4.4
+3.5 to +3.9         .4                       .3
+3.0 to +3.4        1.1        .6
+2.5 to +2.9        2.1       1.1             .6          .9
+2.0 to +2.4        2.6       3.0            1.4         1.4
+1.5 to +1.9        3.5       3.6            4.7         3.2
+1.0 to +1.4        9.7       7.9            7.8         4.2
+ .5 to + .9       10.9      14.1           11.1        10.6
  .0 to + .4       19.7      19.7           24.1        29.6
  .0 to - .4       19.2      18.5           21.0        25.6
- .5 to - .9       12.8      14.6           10.3         8.8
-1.0 to -1.4        9.0       6.5            9.4         6.0
-1.5 to -1.9        6.1       3.5            4.7         3.2
-2.0 to -2.4        1.2       3.0            2.2         4.6
-2.5 to -2.9        1.6       1.8             .6          .6
-3.0 to -3.4         .7        .9             .8          .5
-3.5 to -3.9                                  .3          .5
-4.0 to -4.4                   .2                         .5
-4.5 to -4.9         .4        .5             .3
-5.0 to                        .5             .3


Table XVI gives data for the four groups of combined tests 
which are similar to the data of Table XV for the eleven dif- 
ferent tests. The frequencies of scores above and below the in- 
dividual medians are expressed in percentages of the total num- 
ber of scores in each combined group. Figs. 18 to 21 represent 
graphically the distributions of this table. Fig. 17b, represent- 
ing the distribution for the eleven tests in February, 1917 is re- 
peated in order to facilitate comparison. 

These figures cannot be compared directly because, as has 
already been pointed out, the number of tests is different in 
each case. Expressing the frequencies in percentages equates 
the surfaces of distribution and permits some inferences to be 
drawn concerning the general shape of the curves. Figs. 18 
and 19 representing the eight Trabue tests and the six mathe- 
matics tests are strikingly similar to Fig. 17b which represents 
the eleven tests. They do not show as much variability among 
the achievements of the individual as in the case of the eleven 
tests but they show a rather surprisingly large amount of vari- 
ability. The mode is more pronounced in that the width of great 
density is larger. The extent of the curves and their shape 
near the base line are quite similar. The curves of Figs. 20 
and 21 representing the five directions tests and the three read- 
ing tests differ from the others rather markedly. The smaller 
number of tests is probably a very potent reason for this, es- 
pecially in the latter case. 

Figs. 18 to 21. Distribution (in Per Cents) of Scores Above and Below 
the Individual Medians in Certain Tests Transmuted into Multiples of Q 
by the Original Distributions. 

Fig. 18. Eight Trabue Tests Combined 
Fig. 19. Six Mathematics Tests Combined 
Fig. 17b. Feb. 1917 Testing (Repeated) 
Fig. 20. Five Directions Tests Combined 
Fig. 21. Three Reading Tests Combined 

TABLE XVII 

The Q of the Distribution of Scores Above and Below the Individual 
Medians for the Three Testings and for Certain Tests Combined 

Tests                    Q
February, 1916          .81
February, 1917          .79
June, 1917              .80
Eight Trabue            .73
Six Mathematics         .71
Five Directions         .58
Three Reading           .46

The Q's of Table XVII were calculated from the distribution 
of scores above and below the individual medians shown in 
Tables XXXVII and XXXVIII in the Appendix. They should 
be compared with 1.00Q, the variability of the group used as the 
standard. The results here are slightly smaller than those of 
Table XI because the scores of the less variable pupils beyond 
the Q, but still less than the Q of the more variable pupils, re- 
duce the size of the Q in the total distribution. 

TABLE XVIII 

Distribution by Tertiles of Ranges Above and Below the Individual 
Medians in the Eleven Tests in Values of Q 

The Measure of Central Tendency is the Median of the Individual's Scores 
Transmuted into Multiples of Q. 

TABLE XIX 

Distribution by Tertiles of Ranges Above and Below the Individual 
Medians in Certain Tests in Values of Q 

The Measure of Central Tendency is the Median of the Individual's Scores 

Tables XVIII and XIX and Figs. 22a to 25c are introduced to 
supplement the data given in Tables XIII and XIV. "Nothing 
short of the entire distribution table is a complete measure of 
a variable fact. . . . " * The first nine columns of Table 
XVIII are not separately represented graphically. The last 
three columns are the averages of the respective tertiles for the 
three testings. Figs. 22a, b, and c show these averages for the 
three tertiles of the group. The curves, of course, are bimodal 
because they represent two variables, the ranges above and be- 
low the median. They are joined to show the increase in the 
extent of the ranges of the pupils of the third tertile over those 
of tertiles two and one. The curves show the greater range of 
the extreme scores of the third tertile both above and below the 
median achievement, being especially true of the range below the 
median. Here again the curves are lopped off at 10Q and more 
minus. Finally, these curves show one point upon which Tables 
XIII and XIV do not give definite evidence. The 25 per cent 
increase below the median in the range of tertile three over ter- 
tile one is not accounted for chiefly by a very few extremely 
variant ranges but by the greater variability of this tertile in 
general. Further, the few extreme ranges above the median 
count still less in effecting the 45 per cent increase in the range 
of the third tertile above. 

* Thorndike, E. L., Mental and Social Measurements, p. 30. 

Figs. 22a to 23c. Distribution by Tertiles of Ranges Above and Below 
the Individual Medians in the Eleven Tests and in the Eight Trabue Tests 
in Values of Q. The Measure of Central Tendency is the Median of the 
Individual's Scores Transmuted into Multiples of Q. 

Average of Three Testings 
Fig. 22a. Ranges in Tertile I 
Fig. 22b. Ranges in Tertile II 
Fig. 22c. Ranges in Tertile III 

Eight Trabue Tests 
Fig. 23a. Ranges in Tertile I 
Fig. 23b. Ranges in Tertile II 
Fig. 23c. Ranges in Tertile III 

Figs. 24a to 25c. Distribution by Tertiles of Ranges Above and Below 
the Individual Medians in the Six Mathematics and Five Directions Tests. 

Six Mathematics Tests 
Fig. 24a. Ranges in Tertile I 
Fig. 24b. Ranges in Tertile II 
Fig. 24c. Ranges in Tertile III 

Five Directions Tests 
Fig. 25a. Ranges in Tertile I 
Fig. 25b. Ranges in Tertile II 
Fig. 25c. Ranges in Tertile III 



Figs. 23a, b, and c represent similar data for the eight Trabue 
tests; Figs. 24a, b, c such data for the six mathematics tests; 
and Figs. 25a, b, c such data for the five directions tests. The 
same number of cases, twenty-four above and twenty-four below 
the median, is represented by the surface of each graph. The 
figures for the combined tests disclose fewer extremely variant 
ranges. Figs. 23a and 23c show that the exception to the in- 
crease in the range of the pupils ranking low over those ranking 
high, namely, in the eight Trabue tests, is not the result of a 
few extremely variant ranges in the first tertile, but an inherent 
result of the form of distribution. 

The measures of extreme variability are emphasized not be- 
cause they are thought to have ordinarily more significance than 
measures of variability near the central tendency but because 
it is one of the chief purposes of this investigation to study the 
extremely variant achievements. 

The results of this topic and of the preceding one also tend 
to show that the pupils ranking lowest are most variable. Be- 
fore leaving the topic further comparison of these results should 
be made with the results obtained by ranks and given in Table 
IV. The data of Table IV show that the pupils ranking lowest 
are no more variable than are the pupils ranking highest, and 
that the pupils ranking nearer the median are the most vari- 
able ones. The average range of quartiles two and three is 
shown to be 16 per cent greater than the average range of quar- 
tiles one and four. The range of the fourth quartile below the 
median is shown to be less than half as great as the range of the 
first quartile below the median, while by the classification by 
the Q variability it is shown to be 25 per cent greater than the 
range of the first. Likewise with the rest of the results of 
this table. 

Another point should be noted in this connection. Among 
this group of pupils there are no such types or pronounced ex- 
tremes as are represented by Fig. 1, constructed from the classi- 
fication by ranks. The piling up of scores at each end of the 
range as shown in Cases 60 and 92 is a spurious result of the 
method caused by the failure to retain the relative proportions 
of the original distributions. 

This comparison is additional evidence that the method of 
evaluating achievements in terms of ranks from the highest to 
the lowest in the group does not produce as reliable results in 
connection with the different achievements as does the method 
used in this investigation. 

3. Overlapping of Divisions of the Group

There is another question in connection with this part of the 
problem of variability of the individual that should be asked 
concerning relative variability. Having single measures of the 
individual's variability and having the distribution of all his 
scores, the difference in ability of the different individuals should 
be known. Do the pupils who rank low and vary more in their 
achievements than the pupils who rank high, differ from those 
ranking high only a little in ability or do they differ a great deal?
This difference can be measured by the per cent of overlapping 
of the scores among the different divisions of the group. Table 
XX gives the amount of overlapping of each quartile over the 
other according to three different points of reference: the
median, twenty-five percentile, and seventy-five percentile.
The three different testings, February 1916, February 1917,
and June 1917, of Table XX are not sufficiently differentiated 
one from another to offer any points for special notice. Their 
overlappings are very similar. Consequently the average may 
be considered as typical of the three. The correspondence not 
the variation is the striking point of the overlappings revealed 
by the averages. The per cents of scores of Quartiles IV, III, 
and II that exceed the seventy-five percentile of Quartile I show 
a very close agreement with the per cents of Quartiles I, II, and
III that extend below the twenty-five percentile of Quartile IV.
The comparisons are: 4.7 with 3.5, 6.3 with 6.1, and 10.3 with
10.3. Similar comparisons using the per cents of the same
quartiles above the median of the first and below the median
of the fourth give: 9.1 with 9.3, 18.6 with 14.1, and 24.7 with
26.9. Other comparisons that might be made would disclose 
about the same agreement. This shows that the high scores of 
the low pupils overlap the high scores of the high pupils to an 
extent that corresponds very closely with the overlapping of the 
low scores of the high pupils over the low scores of the low 
pupils. 
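
The overlapping measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score lists and function names are hypothetical and are not taken from the study, and overlapping is assumed to mean the per cent of one division's scores that fall beyond a chosen reference point (median, twenty-five percentile, or seventy-five percentile) of another division.

```python
from statistics import median, quantiles

def pct_above(scores, reference):
    """Per cent of scores lying strictly above a reference point."""
    return 100.0 * sum(s > reference for s in scores) / len(scores)

def pct_below(scores, reference):
    """Per cent of scores lying strictly below a reference point."""
    return 100.0 * sum(s < reference for s in scores) / len(scores)

# Hypothetical scores (in multiples of Q) for a high and a low division.
high_division = [0.5, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4, 3.0]
low_division = [-3.0, -2.2, -1.8, -1.5, -1.0, -0.6, 0.8, 1.7]

# Reference points of the high division.
high_median = median(high_division)
high_p25, _, high_p75 = quantiles(high_division, n=4, method="inclusive")

# Per cent of the low division's scores exceeding each reference point.
overlap_at_median = pct_above(low_division, high_median)
overlap_at_p75 = pct_above(low_division, high_p75)
```

The same two helper functions, applied with the low division's percentiles as reference points, would give the overlapping of the high division downward.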

TABLE XXI

Difference in Achievement Between Quartiles Measured in Terms
of the Q Variability of the Group

The table reads: Between the medians of Quartiles I and II there were
22.2 per cent of the scores of Quartile I and 25.3 per cent of the scores of
Quartile II, etc.

                                                 Per Cent of Scores Between
                                                 the Medians of Quartiles:
                                                 I and II   II and III   III and IV
Per cent of higher quartile overlapping lower      22.2        18.2         23.1
Per cent of lower quartile overlapping higher      25.3        15.5         19.9
Average overlapping                                23.8        16.9         21.5
Value in Q                                          .95         .65          .84

The results given in Table XXI are calculated from the aver- 
ages in Table XX. The following example illustrates the method. 
In Quartile I, 27.8 per cent of the scores are below the 
median of Quartile II. This leaves 22.2 per cent of the scores 
of Quartile I between its median and the median of Quartile 
II, etc. The values in Q are taken from a table of "P.E. Values
Corresponding to Given Per Cents of the Normal Surface of
Frequency, Per Cents Being Taken from the Median."* The
results show a slightly greater difference between the first and
second and between the third and fourth quartiles than between 
the second and third. In median achievement the fourth quar- 
tile is 2.44Q below the first quartile. This shows that the vari- 
ability of the fourth quartile is that of a distinctly lower grade 
of work. 
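
The two steps of this calculation can be retraced with a short Python sketch. It assumes that Q is numerically the probable error (P.E.) of a normal distribution, and it substitutes the standard library's inverse normal function for the printed P.E. table; the function name `pe_value` is a hypothetical label. The computed values agree with the tabled .95, .65, and .84 only to within about a hundredth, presumably because the printed table was read by interpolation.

```python
from statistics import NormalDist

# One probable error (P.E., taken here as equal to Q) in standard deviations.
PE_IN_SIGMA = NormalDist().inv_cdf(0.75)  # about 0.6745

def pe_value(pct_from_median):
    """Distance from the median, in P.E. units, that encloses the given
    per cent of a normal distribution measured from the median."""
    z = NormalDist().inv_cdf(0.5 + pct_from_median / 100.0)
    return z / PE_IN_SIGMA

# Step 1: per cent of Quartile I lying between its own median and the
# median of Quartile II (27.8 per cent lie below the median of Quartile II).
pct_between = 50.0 - 27.8

# Step 2: convert the average overlappings of Table XXI into values in Q.
average_overlapping = [23.8, 16.9, 21.5]
values_in_q = [pe_value(p) for p in average_overlapping]

# Sum of the three steps: distance between the medians of Quartiles I and IV.
total_in_q = sum(values_in_q)
```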

There still remains another question of interest and impor- 
tance, namely: On the basis of individual achievement in the 
eleven tests how far from zero ability in the traits measured 
are the different divisions of this group? The answer to this
would round out this section of the problem. It would mean 
that any score of an individual could be related not only to his 
other scores and to the scores of other individuals of the group, 
but also that its absolute value could be determined. These ab- 
solute values could be determined for the scores of the tests 
that have been built by scaling achievements from the zero point, 
but for the others they could only be estimated. Therefore, this 
part of the problem will have to be left unanswered. This serves 
to emphasize the need for more tests scaled from zero for the 
problems in educational diagnosis. 

For the purpose of individual diagnosis the variability of a 
test should be standardized either by grade or by age of the
pupils. Having such a measure the scores of an individual could 
be compared by transmuting them into multiples of this vari- 
ability without the labor involved in this investigation of deter- 
mining a measure of variability by testing a group. A large 
number of cases would reduce the unreliability of the measure 
of variability to a very small amount and would make it possible 
to secure very reliable measures of the relative achievements 
of the individual. 

Questions 3, 4, and 5 in the statement of the problem, con- 
cerning the amount of the individual's variability, the distribu- 
tion of individual variability, and the variability of bright, 
mediocre, and dull pupils, have been considered in this section 
of the study. The average amount of variability of the subjects
of this investigation for the three testings in the eleven
tests used has been found to be eighty-two per cent of the varia-
bility of the group in the same tests. The variability in the
last testing is slightly greater than in the first testing. The
distribution of the achievements of the individual approximates
the normal surface of frequency, the chief differences being a
more pronounced mode and skewness downward from the
median. On the basis of three equal divisions of the group the
bright pupils are least variable and the dull pupils are most
variable in their achievements. The distributions of the achieve-
ments for all three divisions have the same general form.

*Trabue, M. R., Completion-Test Language Scales, Teachers College,
Columbia University, Contributions to Education, No. 77, p. 38.



VI 
EXTREME VARIABILITY IN INDIVIDUAL CASES

1. Extreme Variability in Different Tests 

Mention has already been made of certain probable causes of 
low scores, such as distractions of the moment due to chance 
occurrences, and abnormal mental or physical condition of the 
individual at the time of the test. It has also been pointed out 
that these factors are not effective in producing high scores, or 
if effective at all, only to a very slight extent in comparison 
with their effect in causing low scores. The effect of chance 
happenings and abnormal conditions upon the achievement of 
the pupil can be ascertained to some extent by classifying the 
extremely variable or erratic scores and also the boys who make 
them, and by comparing the results from re-examination under 
more closely controlled conditions with the original achieve- 
ments. 

As can readily be seen from the chart in Fig. 15 there are no 
distinct types of scores or individuals. Therefore the line di- 
viding extreme variability from the rest of the distribution must 
be arbitrarily drawn. A distance of 3Q from the individual 
medians was chosen for the location of this line. It was placed 
here because at about this point is the beginning of the second 
slow decrease in the normal curve of probability as characterized 
by a "slow-rapid-slow" decline in either direction from the
median. It includes 47.8 per cent of the scores on either side 
of the median. Scores 3Q or more from the median in each 
direction will be called erratic either plus or minus. In Table 
XXII all the erratic scores in the three testings are classified 
by testing for each test and by total and average for each test. 
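
The 3Q line can be checked against the normal curve, and the resulting rule stated as a small function. The sketch below assumes that Q equals the probable error of a normal distribution (about 0.6745 standard deviations); the function `classify` and its arguments are hypothetical names used only for illustration.

```python
from statistics import NormalDist

normal = NormalDist()
Q_IN_SIGMA = normal.inv_cdf(0.75)  # Q (probable error) in standard deviations

# Per cent of a normal distribution between the median and 3Q on one side.
pct_within_3q = 100.0 * (normal.cdf(3 * Q_IN_SIGMA) - 0.5)
# Per cent lying beyond 3Q in one direction.
pct_beyond_3q = 50.0 - pct_within_3q

def classify(score_in_q, individual_median_in_q, limit=3.0):
    """Label a score by its distance from the individual's own median."""
    deviation = score_in_q - individual_median_in_q
    if deviation >= limit:
        return "erratic plus"
    if deviation <= -limit:
        return "erratic minus"
    return "not erratic"
```

Under these assumptions `pct_within_3q` comes to roughly 47.8, in agreement with the figure quoted above.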

The totals at the bottom of Table XXII show an increase in 
the number of erratic or extremely variable scores both plus and 
minus in each of the two later testings. Other things being equal 
this would show that the individual's abilities to achieve in these 
tests had increased at different rates. If the identical tests or 
tests of equated values had been used for the second and third 
testings the absolute amount of gain in each could be deter-
mined. Such evidence would be more reliable than the evidence
obtained, which measures the individual's increase in ability in
relation to the rest of the group.

TABLE XXII

[...] Different Tests 3 Q or More Plus or Minus

[Table body not recoverable from the scan.]

* This type of test not given.

This increase in the number of erratic scores might also be 
accounted for by the piling up of scores at the mode to a greater 
extent in the later testings thus reducing the extent of the Q 
and thereby increasing the transmuted value of a deviation of 
the same absolute amount in all three testings. Table IX con- 
tains some evidence on this point, but not enough to decide it 
either way. 

In the composition tests the Q in terms of the same scale is 
smaller in the last two testings than in the first. In the first 
testing there are two erratic scores, one plus and one minus; 
in the second testing there are three erratic scores; and in the 
third testing there are no erratic scores. In reading, the iden-
tical test, Alpha 2, Part II, was repeated in the second testing.
The Q is slightly larger in the second testing than in the first 
testing and the number of erratic scores is the same. The 
reading test of the third testing was composed of different se- 
lections and therefore the Q can not be compared with the Q 
of Alpha 2. The number of words in the spelling tests was the 
same throughout. The first test was the easiest and has a small-
er Q than the last, showing a greater piling up of scores; yet
this Q, although much less than that of the last test, falls one
short of producing as many erratic scores as there are in the last test.
Opposed to these results, the smaller Q of the second opposites
test produces decidedly more erratic scores than the larger Q's 
of the other two tests. Other examples could be cited showing 
either result. 

The results of Tables IX and XXII, in so far as they bear on 
this question, show that there was no marked reduction of vari- 
ability caused by the repetition of the tests and therefore that 
the increase in the number of erratic or extremely variable scores 
in the later testings was not caused to a large extent by smaller
Q's.

Table XXII shows that in all three testings there were 108 
erratic scores. Of these 29 per cent were plus and 71 per cent 
were minus. This gives an average number of erratic scores
in each testing which is 4.5 per cent of the total number in each 
testing. That is, of every hundred scores four and one half 
were 3Q or more from the individual medians. It shows that 
the curves were skewed downward for in the normal surface of 
frequency only 2.2 per cent of items are beyond 3Q. 

The last three columns of Table XXII give the average num- 
ber of scores plus, minus, and plus and minus in each group of 
closely related tests. The first two of the three columns are 
the more significant. They show that in the rate tests practic- 
ally all the erratic scores are minus. It would be expected that 
they would show more erratic scores minus than plus because 
distractions of the moment operate in this direction and affect 
rate tests most of all. Excepting the Algebra Addition and 
Subtraction test which was given but once and which, more- 
over, was in process of construction, spelling caused more er- 
ratic scores than any other test, and all of these were erratic 
in the minus direction. 

The number of erratic scores resulting cannot be taken as 
a criterion for judging the unreliability of a test except in cases 
where scores are caused by chance happenings. Within limits 
the possibility for such results in a test would appear to have 
an inverse relation to the reliability of the test. The possibility
of fine discrimination in achievements and the possibility for 
the functioning of a wide range of ability appear to be two fac- 
tors which have a direct relation to the value of a test in educa- 
tional diagnosis of the individual. 

Table XXIII summarizes the results in the first half of Table 
XXII in a different way from that in which they are summar- 
ized in the last half of that table. It shows the erratic scores 
plus and minus by tertile and total for each testing. The sig- 
nificance of the table is in the increase in the number of erratic 
scores both plus and minus of the second tertile over the first 
and of the third tertile over the second. The numbers of erratic 
scores plus are 4, 10, and 17, and the numbers of erratic scores 
minus are 20, 21, and 36 for Tertiles I, II, and III, respectively. 
Of the total number of erratic scores, the per cent in each tertile 
is as follows: first tertile, 22 per cent; second, 29 per cent; and
third, 49 per cent. 
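
The percentages quoted for the erratic scores follow from the counts by simple division, as the sketch below shows. The counts are those given in the text; the figure of 72 pupils and eleven tests per testing is taken from statements elsewhere in the study, and the variable names are illustrative only.

```python
# Erratic scores (plus, minus) in each tertile, as quoted in the text.
erratic = {"I": (4, 20), "II": (10, 21), "III": (17, 36)}

total = sum(plus + minus for plus, minus in erratic.values())

# Share of all erratic scores falling in each tertile, in per cent.
share_by_tertile = {
    tertile: round(100 * (plus + minus) / total)
    for tertile, (plus, minus) in erratic.items()
}

# Share of erratic scores in each direction, in per cent.
pct_plus = round(100 * sum(p for p, _ in erratic.values()) / total)
pct_minus = round(100 * sum(m for _, m in erratic.values()) / total)

# Average erratic scores per testing as a per cent of all scores in a
# testing, assuming 72 pupils and 11 tests in each of the three testings.
scores_per_testing = 72 * 11
pct_erratic_per_testing = round(100 * (total / 3) / scores_per_testing, 1)
```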

TABLE XXIII

Number of Scores 3 Q or More Plus or Minus by Tertiles and
Total for Each Testing

                 Tertile I             Tertile II            Tertile III
             Plus  Minus  Total    Plus  Minus  Total    Plus  Minus  Total    Total
Feb. 1916      2     5      7        4     6     10        2     7      9        26
Feb. 1917      0     7      7        4     5      9        7    15     22        38
June 1917      2     8     10        2    10     12        8    14     22        44
Total          4    20     24       10    21     31       17    36     53       108

2. Extreme Variability of Different Boys

The classification of erratic scores by test in which they oc-
curred is only part of their description. Under this topic an-
other part is given: the classification by boys who made them.
The following questions are considered: What per cent of the
boys made erratic scores? Were there more or fewer boys who
made erratic scores the longer they remained in school? If a
boy has erratic scores in one testing what is the expectancy of
his having erratic scores in one or both of the other testings?
How do the high, median, and low ranking divisions compare as
to the number of boys making erratic scores?

Table XXIV gives the number of boys making erratic scores 
in each test in one testing only and in all the combinations of 
testings. For example, the three boys counted in spelling in 
next to the last column of the table are not counted under the 
separate years. The results shown in this table are not different 
from what would be expected in the light of Table XXII. Spell-
ing and the rate tests (opposites, mixed relations, and easy
directions) show the largest number of boys making erratic
scores minus. In comparing these totals, division by the num-
ber of times the tests were given is implied. The only point
that should be noted in connection with the number of boys
making erratic scores plus is the complete lack of such in the
tests just mentioned (spelling and the rate tests) except one
case in easy directions.

TABLE XXIV

Number of Boys Making Scores 3 Q or More Plus or Minus in Each
Type of Test in Either One or More Testings

[Table body not recoverable from the scan.]

TABLE XXV

Number of Boys Making Different Numbers of Scores 3 Q or More
Plus or Minus in All Three of the Testings. Each
Boy Is Counted Only Once

[Table body not recoverable from the scan. Its columns are headed No. of Boys and No. of Scores.]

Table XXV shows the number of boys making different num-
bers of erratic scores in all three testings combined. The re-
sults show that 76 per cent of the boys made one or more erratic
scores in the three testings combined. However, the impression
given by this percentage is not quite fair. Too great a penalty
is placed upon the making of one erratic score in any one of
the three testings. This measure should be supplemented by the
average of the three testings. Table XXVI shows that in the
testings taken separately there were 25, 30, and 34 boys respect-
ively who made erratic scores. These numbers give an average
for the three testings of 41 per cent of the boys who made erratic
scores.

Table XXVI also answers the question as to whether more 
or fewer boys made erratic scores the longer they remained in 
school, showing that in the second and third testings the num-
ber was increasingly greater. In February, 1916, 35 per cent
made one or more erratic scores; in February, 1917, 42 per cent;
and in June, 1917, 47 per cent.

Another question arises in this connection: Do the pupils 
who make erratic scores make more or fewer per pupil in the 
later testings? From the data of Tables XXIII and XXVI it 
is found that the number of erratic scores per pupil making 
erratic scores in the first testing is 1.04, in the second testing, 
1.27, and in the third testing, 1.29. The rest of the data in- 
cluded in Table XXVI show the number of scores plus and the 
number minus made by every boy in each testing and in all 
three testings combined. The table reads: In February, 1916, 47
boys made neither plus nor minus erratic scores; 7 made one 
plus score each and no minus scores; 17 made one minus score 
each and no plus scores; and 1 made one plus score and one 
minus score. 
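
The percentages and ratios in the last three paragraphs can be retraced as follows. The numbers of boys (25, 30, and 34 out of 72) are quoted in the text. The per-testing totals of erratic scores (26, 38, and 44) are an assumption of this sketch: they are the values that sum to the 108 erratic scores reported and reproduce the stated ratios of 1.04, 1.27, and 1.29.

```python
PUPILS = 72  # number of boys tested, as stated in the study

# Boys making one or more erratic scores in each testing (from the text).
boys_with_erratic = {"Feb. 1916": 25, "Feb. 1917": 30, "June 1917": 34}
# Assumed totals of erratic scores in each testing (see the note above).
erratic_scores = {"Feb. 1916": 26, "Feb. 1917": 38, "June 1917": 44}

# Per cent of the boys making erratic scores in each testing.
pct_boys = {k: round(100 * n / PUPILS) for k, n in boys_with_erratic.items()}

# Average of the three testings, in per cent.
average_pct = round(
    sum(100 * n / PUPILS for n in boys_with_erratic.values()) / 3
)

# Erratic scores per boy among the boys who made any.
scores_per_boy = {
    k: round(erratic_scores[k] / boys_with_erratic[k], 2)
    for k in boys_with_erratic
}
```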

Table XXVII analyzes the number of boys opposite each num- 
ber of testings accordingly as they made only plus, only minus, 
or both plus and minus scores in the different testings. The 
last case at the bottom of the table is interesting. In one test- 
ing this boy made one or more erratic scores plus, but none 
minus; in another testing he made one or more minus, but none
plus; and in the remaining one of the three testings he made 
both plus and minus erratic scores. The table shows that the 
erratic scores made by one individual are not confined to one
type. In the three testings 16 boys made erratic scores both
plus and minus, and 7 of these made erratic scores both plus
and minus in the same testing. There were 30 boys who made
only minus erratic scores and 9 who made erratic scores in the
plus direction only. Of the 72 pupils who were tested 76 per
cent made one or more erratic scores; 42 per cent made erratic
scores in one testing only; 22 per cent in two testings; and 12
per cent in all three testings. Of the boys who showed this
amount of variability in their achievements in one testing, 46
per cent showed it again in either one or both of the other two
testings. These figures show that a large percentage of the boys
made extremely variable scores, that is, scores of 3Q or more above
or below their median achievements.

TABLE XXVI

Number of Boys Making Scores 3 Q or More Plus or Minus and
Number of Scores of Either Type That Each Boy Made

[Table body not recoverable from the scan. Its panels are headed Feb. 1916, Feb. 1917, June 1917, and The Three Testings.]

TABLE XXVII

Number of Boys Making Scores 3 Q or More Plus, Minus, and Plus
and Minus in One or More of the Three Testings. Each
Boy Is Counted Only Once

Erratic Scores Made in        No. of Boys   Subdivision by Type of Scores Made
None of the three testings        17
One of the three testings         30        8 plus only, 18 minus only, 4 plus and minus
Two of the three testings         16        1, 8, 6, 1 [type symbols illegible in the scan]
All three testings                 9        4, 2, 1, 1, 1 [type symbols illegible in the scan]
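
The percentages drawn from Table XXVII can be recomputed from the counts of boys. The sketch below uses the counts stated in the text and in the table (17, 30, 16, and 9 boys by number of testings; 9, 30, and 16 boys by direction of their erratic scores); the variable names are illustrative. One decimal place is kept so that the rounding to whole per cents used in the text remains visible.

```python
PUPILS = 72

# Boys grouped by the number of testings in which they made erratic scores.
boys_by_testings = {0: 17, 1: 30, 2: 16, 3: 9}
# Boys grouped by the direction of the erratic scores they made.
boys_by_type = {"plus only": 9, "minus only": 30, "plus and minus": 16}

# Boys who made at least one erratic score in any testing.
boys_erratic = PUPILS - boys_by_testings[0]

# Per cent of all pupils, by number of testings with erratic scores.
pct_any = round(100 * boys_erratic / PUPILS, 1)
pct_by_testings = {
    k: round(100 * n / PUPILS, 1) for k, n in boys_by_testings.items() if k > 0
}

# Per cent of the boys making erratic scores, by direction.
pct_by_type = {
    k: round(100 * n / boys_erratic, 1) for k, n in boys_by_type.items()
}
```

The per cents by direction (about 16, 55, and 29) are the same figures that open the next topic.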

One more question asked at the beginning of this topic re-
mains to be answered: How do the high, median, and low divi-
sions compare in the number of boys making erratic scores?
The answer could be predicted from Table XXIII. Table
XXVIII gives the facts. In every testing the second tertile has
more boys making erratic scores than does the first tertile, and
in every testing the third tertile has most of all, with one ex-
ception, February, 1916, when there were more in the second
tertile. The totals show a consistent increase in the number. 
Using the average number of boys who made erratic scores it 
is found that 24 per cent are in the first tertile, 32 per cent are 
in the second tertile, and 44 per cent are in the third tertile. 

TABLE XXVIII

[Table body not recoverable from the scan.]

3. Reduction of Variability by Re-examination

In the preceding topic of this section it has been found that
of all the erratic scores made 71 per cent were minus and 29 per
cent were plus, and that of all the boys making erratic scores 16
per cent made plus scores, 55 per cent minus scores, and 29 per
cent made both plus and minus scores. The problem here is 
to determine the reduction in the number of erratic scores in 
these same tests which a special examination under closely con- 
trolled conditions would produce. 

The tests used in the special examination were identical with
those used in the original testings. They were given about three
weeks after the third testing. The time allowed was as nearly
equal to the time in the original testing as was possible. Espe-
cial care was taken to insure the subject's best reaction in accord-
ance with the directions of the test. These tests are described
in Section III under Special Testing. Re-examination of all
the boys who made erratic scores in the tests in which they
made them would have produced the most reliable results. This,
however, was inexpedient and consequently only a part of the 
group were re-examined. 

In Table XXIX the results secured in the four tests used in 
the special examination are compared with the results of the 
original testings. The values in Q for all the tables in this topic 
were calculated by using the Q of the original distributions. 
Since the number of scores obtained in the special testing is 
not the same in every case as in the original testings and also 
because the number is not the same in all tests, the gains made 
are expressed in per cents. These are shown in the last column
of the table under Reduction in Per Cent of Erratic Scores.
The special examination produced a reduction of 22 per cent in
the number of erratic scores on the basis of the total number of
scores made. All of this reduction of variability should not be
credited to the elimination of accidental or unusual occurrences.
Some of it is probably due to improvement through practice, es-
pecially in the case of the tests which had been used in the third
testing.

TABLE XXX

[Table body not recoverable from the scan. The surviving fragments of the note on how the table reads follow.]

[...] boys were re-examined in three tests each.
[...] 7 points higher than his score in February, 1916,
[...] to his score in February, 1917, and a score 6 points lower
[...] score in June, 1917. In values of Q his scores were 4.3 Q higher,
[...] 1.6 Q lower than his respective original scores. This boy
[...] in ability and 63.6 in variability. (1 being least variable.)

Table XXX analyzes by individuals the difference in value of
Q between the original and special testings. It shows that the
number of gains in score greatly exceeds the number of losses.
There are 26 cases of gain and 7 of loss and 2 cases where the
score is the same as in the original testing. From the results
in this table it is found that the average gain in points over the
original scores is 5 in spelling, 1.1 in opposites, 5.9 in mixed re-
lations, and 5.6 in easy directions. These results are supple-
mentary to the results of Table XXIX. Of the sixteen indi-
viduals re-examined all but one were below the median in ability
and all but one were more variable than the median.

4. The Causes of Extreme Variability

The results of this investigation show a rather large amount
of variability among the different achievements of the individual
when compared with the variability of the group. By the meth-
od described in Section IV the amount of this variation has been
measured and the results have been given in Section V. In this
section the extreme cases of variability, those 3Q or more from
the median scores of the individual, have been segregated and
classified by tests and by individuals. Also the results from
the re-examination of certain boys making extremely variable
scores have been compared with the results from the original
testings of the same boys. Some evidence concerning the causes
of extremely variable or erratic scores has appeared in connec-
tion with other parts of the problem. Under this topic such
evidence will be collected and some additional data will be dis-
cussed.

The following causes appear to be factors which may operate
individually or in combinations to produce extremely variable
or erratic scores:

a. The nature of the tests used.

b. The administration of the tests.

c. Accidental or unusual occurrences.

d. Statistical treatment of results.

e. The ability of the individual in different traits.

The effect of these causes individually can be discussed from
a general standpoint, but the extent to which each one operated
in producing specific erratic scores can not be definitely deter-
mined from the data of this investigation.

The tests used are well standardized in degrees of difficulty.
However, the amount of the increasing increments of difficulty 
which can be shown by the individual score and also the range 
of ability which is covered vary considerably. In group meas- 
urements increments smaller than the interval of the scale can 
be measured by interpolation within the interval of the scale, 
but in measurements of the individual such calculations cannot 
be made and consequently increments smaller than the interval 
of the scale cannot be measured. 

In transmuting the original scores into multiples of Q the value 
of a score was taken as one half interval higher than the actual 
score in all tests except composition in which case the value 
taken was the exact score. The values taken thus represent the 
midpoint of all the unmeasured achievements. Consequently 
the maximum displacement that could be produced by the in- 
terval of the scale is just barely less than one half the amount 
of Q representing the interval of the original distribution. The 
amount of Q representing one half the interval of each of the 
original distributions is shown in Table XXXI. 
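
The transmutation rule just described can be put as a short sketch. The function names and the numerical example are hypothetical, and the sketch assumes that a transmuted score is the deviation of the credited value from the group median divided by the Q of the group; the study's own procedure is the one given in Section IV.

```python
def transmute(score, group_median, group_q, interval, exact=False):
    """Express a raw score as a multiple of the group's Q.

    Unless `exact` is true (as for composition), the score is credited
    with one half of the scale interval before conversion.
    """
    value = score if exact else score + interval / 2.0
    return (value - group_median) / group_q

def half_interval_in_q(interval, group_q):
    """Upper bound, in Q, on the displacement due to the scale interval."""
    return (interval / 2.0) / group_q

# Hypothetical test: unit scale interval, group median 12.5, group Q of 2.5.
example = transmute(score=9, group_median=12.5, group_q=2.5, interval=1)
bound = half_interval_in_q(interval=1, group_q=2.5)
```

With these assumed figures a raw score of 9 is credited as 9.5 and falls 1.2 Q below the group median, and no achievement within that interval can be misplaced by as much as 0.2 Q.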

TABLE XXXI

The Amount of Q Representing One-Half the Interval of the
Distributions of the Different Tests

                                   Feb.    Feb.    June
                                   1916    1917    1917
Woody Multiplication               .24     .42
Woody Division                     .19     .75
Hotz Algebra, Add. and Subt.                       .26
Hotz Algebra, Mult. and Div.                       .19
Trabue B, J, L                     .30     .41     .46
Trabue C, K, M                     .38     .35     .25
Reading Tests                      .12     .11     .16
Visual Vocabulary                  .14     .04     .12
Composition                        .09     .12     .10
Spelling                           .32     .13     .16
Opposites                          .27    1.07     .33
Mixed Relations                    .11     .17     .17
Easy Directions                    .21     .34     .66

Table XXXI shows the extent to which the different tests 
may have failed to record the achievements of the individual 
due to the extent of the interval of the distribution when trans- 
muted into Q. Obviously an allowance for this can not be 
made for any given score because the known value nearest the 
achievement has already been taken. The table shows that in
only a few of the thirty-three tests could the interval of the dis-
tribution be of much significance in causing extremely variable
scores. It may affect not only the extremely variable scores
but also all other scores to the same extent. The point is that
if it happens to act upon an achievement that should be just
less than three Q in either direction from the median of the in-
dividual's achievement it puts that score in the group called
erratic in this study. The effect of the interval of distribution 
is significant only in connection with particular scores; it does 
not affect theoretically either the average amount of variability 
found in Section V or the total number of erratic scores found 
in Section VI. 

Another phase of the first factor in the causation of erratic 
scores is the range of ability covered by the test. The original dis- 
tributions show that the upper limit of the scale was not reached 
by any of the pupils in the following tests : arithmetic, algebra, 
language, reading, visual vocabulary, and composition. In the 
spelling tests the upper limit was reached by seven, one, and eight 
pupils in the three testings respectively. In the association tests 
only a few pupils reached the upper limit except in the oppo- 
sites test of February, 1917 and the easy directions tests of Feb- 
ruary and June, 1917 when a relatively large number of the pu- 
pils made the highest score possible. These tests did not permit 
the best pupils to show their ability in comparison with the rest 
of the group. Consequently the nature of the tests is a factor 
that probably prevented some extremely variable scores in the 
plus direction from the median. 

The distributions of scores resulting from ranges insufficient
to cover the ability of all the pupils increased to some extent
the amount of Q representing the value of the score. This has
been illustrated by Fig. 14 and discussed under Topic 2 of Sec-
tion IV. It was pointed out there that the skewness of the
curve tends to reduce the extent of the Q of the original dis-
tribution from what it would be in a normal distribution. This
increases the variability of the scores when expressed in mul-
tiples of the Q which is smaller than it would be in a normal dis-
tribution. The effect is cumulative so that the farther a score
is from the median of the original distribution the greater is
the amount of spurious variability produced by this cause. On
account of its cumulative effect this cause probably placed some 
scores in the group called erratic and did not have the op- 
posite effect on other scores. From the distributions it would 
appear that skewness could be effective to a marked degree only 
in the minus direction and only in the opposites test of February, 
1917 and the Easy Directions tests of February and June, 1917. 
The same cause could operate to make a score less than 3Q from 
the individual median when it should be more than 3Q from it 
if the median achievement of the individual is more than 3Q 
from the median of the group. However, there are no such 
cases in this investigation. The form of the distribution as a 
cause of extreme variability on the part of the individual may 
be attributed partly to the nature of the tests and partly to the 
statistical treatment of the results. 

The extent of the interval, the range of ability covered by 
the tests, and the form of distribution of the scores have been 
considered in connection with the nature of the tests as possible 
causes of extreme variability. The extent of the interval of the 
distribution was found to have but little effect in causing erratic 
scores, and since it has a compensating effect its significance is 
almost negligible. The range of ability covered by the test 
probably prevented some scores from being erratic in the plus 
direction. The form of distribution of the scores in certain tests 
tends to magnify the amount of variability in the minus direc- 
tion, and probably makes the number of scores in the minus 
direction larger than it should be. The last two causes probably 
account to some extent for the disparity between the number of 
extremely variable scores in the plus direction and the number in 
the minus direction. 

The practical significance of these causes of variability is 
illustrated by the use of standardized tests in the classification 
of pupils for the purposes of instruction. Here the individual 
rather than the group is the unit to be dealt with. In order to 
classify the individuals of a group with adequate exactness the 
range of the tests should be somewhat greater than the range 
of ability, and the interval between scores should be small. 

The extent to which the administration of the tests caused
erratic scores will have to be judged by the conditions under
which the tests were given. In the first testing five of the eleven
tests were given at the schools from which the pupils came. 
The remaining six tests were given at the Speyer School. The 
tests were administered by several graduate students under the 
direction of Professor Briggs. The second testing was conducted 
by Dr. Fretwell, who was one of the group giving the tests the 
first time, and by the writer. The third testing was conducted 
by the writer. Five tests of the first testing were given in the 
regular class rooms of the public schools. The remaining six 
tests of the first testing and all the tests of the last two testings 
were given in the regular class rooms of the Speyer School.
The tests were given during scheduled periods of the school day. 
Pupils were tested in regular class groups of about twenty-five 
each. Instructions concerning the tests were brief and of similar 
nature for each test at each succeeding testing. Considering 
these conditions it is probable that the administration of the 
tests had but little effect in producing erratic scores. 

Accidental or unusual occurrences probably had a marked 
effect upon the scores of certain individuals. In one of the easy 
directions tests, for example, the completion of a face by the 
addition of the nose caused unusual merriment for certain pupils 
who by chance or design produced a rather grotesque face by the 
type of nose added. In a few cases, this diversion impeded ma- 
terially the speed of the work thus precluding a normal achieve- 
ment. 

In one or two instances in the spelling tests pupils, either be- 
cause of some accidental occurrence or because of slowness, fell 
behind the rate at which the words were being pronounced. 
This probably reduced the number of words spelled correctly. 
It is probable, however, that if any extremely variable scores 
were produced by accidental occurrences practically all were 
in spelling and in the rate tests. 

One bit of evidence bearing upon this cause and also upon the 
administration of the tests is found in connection with Table 
XXIX concerning the amount of reduction in per cent of erratic 
scores by special testing. When tested in groups of from two 
to five in spelling and in the rate tests, certain pupils who 
had made extremely variable scores in the original testings re- 
duced their number of erratic scores by from four to thirty-three 
per cent of the total number of scores. Since the special tests 
were given about three weeks after the third testing it is probable
that practice entered into the reduction of these per cents to
some extent. Apparently accidental occurrences had little effect
upon the difficulty tests.

The statistical method of combining the results was chosen 
because it seemed to "preserve all the refinement of the original
measurements" to a greater extent than other methods. That 
the variability of certain scores has been magnified to some de- 
gree by this method has already been pointed out in connection 
with the nature of the tests. However, the method used is cer- 
tainly only a small factor in the causation of extremely variable 
scores. 

Consideration of the ability of the individual as a factor in
the causation of extremely variable scores involves a study of the 
individual's variability from an angle slightly different from 
the attack made thus far. The problem of chief concern has 
been the variability of the individual from his own median 
achievement. The problem presented now is the variability of 
the individual in his own achievements in the same or similar 
tests at succeeding testings. If exactly the same tests had been 
used at each succeeding testing this variability could be meas- 
ured in terms of absolute amounts of gain and loss in the dif- 
ferent tests. Since the identical tests were not repeated in all 
cases such variability must be measured in a different way. 
This can be accomplished either by finding the difference be- 
tween the Q values of the scores in similar tests for the different 
testings, or by ranking the individual's achievements in each
testing and finding the variation in ranks of the scores in ques- 
tion for the three testings. 

If the scores which vary greatly from the individual's median 
achievement are mere chance happenings among the total num- 
ber of achievements there should be a greater variation among 
the ranks in the three testings of the abilities in which such 
scores occur than among the ranks of the abilities which have 
no scores at so great a distance from the median. This of course 
would not hold if the ranking of all the individual's achieve- 
ments is caused by mere chance, but even if this is true the 
rank among the individual's achievements of the abilities hav- 
ing scores at the greatest distance from the median would have 
the same cause as the rank of any other ability and should not
vary in rank any more than the abilities having no scores at the
extreme distance from the median. 

The variability of the individual in his own achievements 
in the three testings has been found for nine of the pupils most 
variable in the range of their scores in the first two testings 
and for eight of the pupils who had but one score of 3Q or more 
in the first two testings, and also for eight of the pupils most
variable in all three testings, and eight who had but one score 
of 3Q or more in all three testings. The results are given in 
Table XXXII. In the treatment of the first two testings the 
two Trabue tests are combined because of their similarity thus 
making ten scores to be ranked. In the treatment of the three 
testings the two Trabue tests are combined, and the arithmetic 
and algebra tests are omitted because they are not similar enough 
to be comparable, thus making eight scores to be ranked. 

The results given in Table XXXII show that the variation of 
the scores at the greatest distance from the individual's median 
achievement is not essentially different from the variation of
those nearer the median. They vary in rank about the same as 
the other achievements. The last two columns of the table show 
further that the variation of the erratic scores in the rankings 
of the pupils having only one erratic score is no greater than the 
variation in the rankings of the pupils who have the most erratic 
scores. Inspection of charts like the one shown in Figure 15 
shows that the absolute amount of variation of these extreme 
scores is greater than the variation of those nearer the median, 
but, like the items near the extremes of any distribution, they 
normally would be expected to vary more in absolute amount. 
The interpretation of these results is that in the ranking of 
the individual's achievements the variability of the extreme 
scores is caused by mere chance to no greater extent than the 
variability of the scores nearer the median. 

From one point of view this topic, the causes of extreme variability,
is a study of the forces that prevent perfect correlation
between different testings of the same ability, and from the 
same viewpoint the whole investigation is a study of the lack 
of perfect correlation between abilities and between different 
testings of the same ability. If there were perfect correlation 
in the above cases there would be, of course, no variability among 
either the achievements of the individual in different tests or 
of the table under Reduction in Per Cent of Erratic Scores.
The special examination produced a reduction of 22 per cent in
the number of erratic scores on the basis of the total number of
scores made. All of this reduction of variability should not be
credited to the elimination of accidental or unusual occurrences.
Some of it is probably due to improvement through practice,
especially in the case of the tests which had been used in the third
testing.

TABLE XXX 
Table reads: In spelling 3 boys were re-examined in three tests each.
One boy made a score 7 points higher than his score in February, 1916,
a score equal to his score in February, 1917, and a score 6 points lower
than his score in June, 1917. In values of Q his scores were 4.3 Q higher,
equal, and 1.6 Q lower than his respective original scores. This boy
ranked 69 in ability and 63.6 in variability. (1 being least variable.)
Table XXX analyzes by individuals the difference in value of
Q between the original and special testings. It shows that the 
number of gains in score greatly exceeds the number of losses. 
There are 26 cases of gain and 7 of loss and 2 cases where the 
score is the same as in the original testing. From the results
in this table it is found that the average gain in points over the 
original scores is 5 in spelling, 1.1 in opposites, 5.9 in mixed re- 
lations, and 5.6 in easy directions. These results are supple- 
mentary to the results of Table XXIX. Of the sixteen indi- 
viduals re-examined all but one were below the median in ability 
and all but one were more variable than the median. 

4. The Causes of Extreme Variability 

The results of this investigation show a rather large amount
of variability among the different achievements of the individual
when compared with the variability of the group. By the method
described in Section IV the amount of this variation has been
measured and the results have been given in Section V. In this
section the extreme cases of variability, those 3Q or more from
the median scores of the individual, have been segregated and
classified by tests and by individuals. Also the results from
a re-examination of certain boys making extremely variable
scores have been compared with the results from the original
tests of the same boys. Some evidence concerning the causes
of extremely variable or erratic scores has appeared in connection
with other parts of the problem. Under this topic such
evidence will be collected and some additional data will be discussed.

The following causes appear to be factors which may operate 
individually or in combinations to produce extremely variable 
or erratic scores : 

a. The nature of the tests used. 

b. The administration of the tests.

c. Accidental or unusual occurrences. 

d. Statistical treatment of results. 

e. The ability of the individual in different traits. 

The effect of these causes individually can be discussed from 
a general standpoint, but the extent to which each one operated 
of extremely variable scores in the plus direction from the
individual's median which would have appeared if the range of
ability covered by the tests had been greater. The nature of 
the tests and the statistical treatment of the results seem to have 
magnified the amount of variability of a relatively small propor- 
tion of the scores. The administration of the tests, in so far as 
it can be judged by the conditions of the testings, had prac- 
tically no effect upon the variability of the scores. Accidental 
or unusual occurrences probably caused a few erratic scores. 
Under a more detailed administration of the tests such occur- 
rences and their effect could be definitely accounted for in the 
results. From the evidence of this study it appears that the 
ability of the individual is the greatest of the five factors in the 
causation of scores which vary 3Q or more from his median 
achievement. 



VII
CORRELATION BETWEEN MEASURES OF ABILITY,
MEASURES OF VARIABILITY, AND MEASURES
OF ABILITY AND VARIABILITY

1. Correlation Between Measures of Ability 

The results that have been set forth up to this point have dealt 
with variation. They may be considered as showing certain pos- 
itive relations, but in an indirect way. In this section of the 
investigation different relations will be studied by means of co- 
efficients of correlation. The last question in the statement of 
the problem will be considered. This question concerns the re- 
lation between different measures of ability, the relation between 
different measures of variability, and the relation between meas- 
ures of ability and variability. 

The first coefficients that are given are between different 
methods of ranking pupils for composite achievement. Three 
methods were used. First, each one of the seventy-two pupils 
was ranked by the average of his eleven ranks. That is, the 
pupils were ranked from one to seventy-two in each test. The 
eleven ranks of each pupil were then averaged and these aver- 
ages were ranked from one to seventy-two, the smallest being 
ranked one. The second method was the same as the first except 
that the median rank was used instead of the average rank. The 
third method was by median rank in the eleven tests as obtained 
from the scores transmuted into multiples of Q. Using the me- 
dian of the individual's ranks in values of Q the pupils were 
ranked from one to seventy-two as in the other methods. 
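
The first two ranking methods can be illustrated with a short Python sketch. It is not part of the original study: the helper names, the four hypothetical pupils, and the treatment of ties (tied composites share the average of the ranks they span) are assumptions made for illustration only.

```python
from statistics import mean, median

def rank_ascending(values):
    """Rank values from 1 (smallest) upward; ties share the average rank.

    The tie rule is an assumption; the source does not say how ties were handled.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def composite_rank(per_test_ranks, summary=mean):
    """per_test_ranks[p] holds one pupil's ranks in each test.

    The pupil with the smallest summary rank is ranked one.
    """
    return rank_ascending([summary(r) for r in per_test_ranks])

# Hypothetical ranks of four pupils in three tests (not data from the study).
ranks = [[1, 2, 4], [2, 1, 1], [3, 4, 2], [4, 3, 3]]
by_average = composite_rank(ranks, mean)    # first method
by_median = composite_rank(ranks, median)   # second method
```

In this small example the average separates all four pupils, while the median leaves the last two tied, which suggests why a composite built on the median discriminates less finely when only a few tests are combined.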

Three correlations were then calculated between the rankings 
by each method, — February 1916 with February 1917 ; February 
1917 with June 1917 ; and February 1916 with June 1917. The 
coefficients of these correlations are given in Table XXXIV. In 
calculating all the coefficients in this section the formula 

$$\rho = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$$

was used. The value of these coefficients in terms of the Pearson
r has been inferred from a table¹ of such values. In all cases
the inferred value of the coefficient is given. The unreliability 
of the coefficients was determined by the formula 

$$\text{P.E.}_{\text{true } r - \text{obt. } r} = .6745 \, \frac{1 - r^2}{\sqrt{n}}$$

"The probable divergence of the true coefficient of correlation
from that obtained from a limited random selection of related
pairs, is a variable fact with a mode at 0, and a variability which
serves as the measure of the unreliability."² The P.E. is the
measure limiting the fifty per cent of this variability which is 
nearest the coefficient obtained. 
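
The two formulas can be checked numerically with a minimal Python sketch. It assumes the rank-difference coefficient has the usual form 1 - 6 ΣD² / (n(n² - 1)) and that the probable error is .6745 (1 - r²) / √n. The function names and the five-pupil example are hypothetical, and the conversion to the Pearson r by table is not reproduced here.

```python
from math import sqrt

def rank_difference_coefficient(ranks_a, ranks_b):
    """Coefficient computed from the differences D between paired ranks."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def probable_error(r, n):
    """Probable error of a coefficient r obtained from n related pairs."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

# Hypothetical rankings of five pupils in two testings (not data from the study).
rho = rank_difference_coefficient([1, 2, 3, 4, 5], [2, 1, 3, 5, 4])

# A coefficient of .77 from seventy-two pupils.
pe = probable_error(0.77, 72)
```

With seventy-two pupils a coefficient of .77 gives a probable error of about .03 under this formula, which agrees with the first entry of Table XXXIV.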



TABLE XXXIV

CORRELATION BETWEEN COMPOSITE RANKINGS IN ABILITY

                                                       P.E.
                                                 r     of r
Average Rank by Rank in Eleven Tests
    Feb. 1916 with Feb. 1917 .................. .77    .03
    Feb. 1917 with June 1917 .................. .78    .03
    Feb. 1916 with June 1917 .................. .68    .04
Median Rank by Rank in Eleven Tests
    Feb. 1916 with Feb. 1917 .................. .69    .04
    Feb. 1917 with June 1917 .................. .73    .04
    Feb. 1916 with June 1917 .................. .53    .06
Median Rank by Values of Q in Eleven Tests
    Feb. 1916 with Feb. 1917 .................. .69    .04
    Feb. 1917 with June 1917 .................. .69    .04
    Feb. 1916 with June 1917 .................. .54    .06



The method of ranking the pupils by their average achieve- 
ment gives distinctly higher coefficients of correlation than either 
of the other methods. The results obtained by ranking them in 
ability by the median of their eleven ranks agree very closely 
with the results obtained by ranking them by their median rank 
in values of Q. Coefficients of correlation obtained from both 
of these methods are approximately 10 per cent lower than the 
coefficients obtained from the method by average rank. The 
reason for the difference between the coefficients obtained from 
the rankings by the average of the eleven ranks and the coef- 
ficients obtained from the rankings by the median of the eleven 
ranks is obvious. With only a few measures a small difference 
in the median score resulting from chance error or the inherent 

1 Thorndike, E. L., Mental and Social Measurements, p. 225. 

2 Ibid., p. 193. 
lack of fine discriminations on account of the small number of
tests, affects the median rank of the individual much more than 
several such differences affect the average rank. In the latter 
case such differences tend to offset each other or if they do not 
entirely balance each other they enter into a composite where so 
much does not depend upon a single measure. 

One other point should be brought out in connection with the 
coefficients of correlation in Table XXXIV. By each of the 
three methods the correlation between the February 1916 and 
June 1917 rankings is about 10 per cent lower than the corre- 
lation between the rankings of the testings closer together in 
point of time. It has been found in another section of the study 
that the number of pupils making erratic scores and the num- 
ber of erratic scores per pupil increased with each succeeding 
testing. Granting that there was improvement in all the abili- 
ties tested this shows that the amounts of improvement of dif- 
ferent pupils in their different abilities were increasingly dis- 
proportionate the longer the pupils remained in school. The 
coefficients of correlation mentioned above tend to show that the 
improvement of the pupils in composite ability also was made 
at varying rates, and that the rate of improvement of different 
pupils did not fluctuate thus overcoming the inequalities, but 
rather that the inequalities became more pronounced the longer 
the pupils remained in school. 

The practical importance of such varying rates of improve- 
ment bears upon the length of time an evaluation of an achieve- 
ment by such tests can be considered as a valid index of the abil- 
ity of the pupil. As examples of such variation two cases from 
this investigation are cited. Pupil No. 51 ranked 67 by the tests 
in February, 1916 ; 35 in February, 1917 ; and 8 in June, 1917. 
This pupil was placed in group 6 in school in February, 1916. 
By the judgment of the teachers he was advanced to group 5 in 
April, group 4 in May, and to group 3 in June, 1916. In Feb- 
ruary, 1917 he was in group 2 and in June, 1917 he was in group 
1. Pupil No. 7 ranked 28 by the tests in February, 1916; 39 
in February, 1917 ; and 63 in June, 1917. In February, 1916 
this pupil was placed in group 3, in June, 1916, he was in group 
4, in February, 1917, in group 5, and in June, 1917 he was still 
in group 5. 
2. Correlation Between Measures of Variability

Is there any constancy in the variability of the individual's
achievement? This question can be studied by finding the 
amount of correlation between measures of variability in the 
different testings. Two methods of ranking the pupils are used. 
One method of ranking is by the extent of their entire range in 
the eleven tests. The individual having the smallest range in 
multiples of Q was ranked one, least variable, and the pupil 
having the largest range was ranked seventy-two, most variable. 
The other method is by the approximation of the interquartile
range. The ranking was made in the same manner, the least 
variable being ranked one. 
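
A minimal Python sketch of the two variability measures follows. The source does not say how the interquartile range was approximated, so the use of `statistics.quantiles` with its default method is an assumption. The pupils' values are hypothetical, only five values are used per pupil for brevity, and ties are not handled.

```python
from statistics import quantiles

def entire_range(q_values):
    """Distance between a pupil's highest and lowest scores, in multiples of Q."""
    return max(q_values) - min(q_values)

def interquartile_range(q_values):
    """Approximate interquartile range of a pupil's scores (assumed method)."""
    q1, _, q3 = quantiles(q_values, n=4)
    return q3 - q1

def rank_least_variable_first(measures):
    """Rank pupils so that the smallest measure is ranked one (no tie handling)."""
    order = sorted(range(len(measures)), key=lambda i: measures[i])
    ranks = [0] * len(measures)
    for position, pupil in enumerate(order, start=1):
        ranks[pupil] = position
    return ranks

# Hypothetical Q-values for three pupils (not data from the study).
pupils = [
    [-0.5, 0.0, 0.2, 0.4, 0.6],
    [-3.1, -1.0, 0.0, 0.5, 1.2],
    [-1.0, -0.4, 0.1, 0.3, 0.9],
]
ranks_by_range = rank_least_variable_first([entire_range(p) for p in pupils])
ranks_by_iqr = rank_least_variable_first([interquartile_range(p) for p in pupils])
```

The entire range depends on the two extreme scores only, while the interquartile range ignores them, so the two rankings need not agree for real data even though they do in this small example.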

The pupils were ranked according to variability in each of 
the three testings by both of these methods. The three combi- 
nations of the rankings by each method were then correlated. 
The coefficients are given in Part A of Table XXXV. The results
show consistently a small positive relation between the
amount of variability in the three testings. Both methods show 
about the same results. 

TABLE XXXV

Correlation Between Measures of Variability in the Eleven
Tests at the Different Times They Were Given

                                                       P.E.
                                                 r     of r
(A) Range in Values of Q
    Feb. 1916 with Feb. 1917 .................. .32    .07
    Feb. 1917 with June 1917 .................. .27    .07
    Feb. 1916 with June 1917 .................. .17    .08
    Interquartile Range in Values of Q
    Feb. 1916 with Feb. 1917 .................. .20    .08
    Feb. 1917 with June 1917 .................. .29    .07
    Feb. 1916 with June 1917 .................. .18    .08
(B) Range in Values of Q
    Feb. 1917 with Teachers' Ratings .......... .20    .08
    June 1917 with Teachers' Ratings .......... .20    .08
    Interquartile Range in Values of Q
    June 1917 with Teachers' Ratings .......... .22    .07

In Part B of Table XXXV three of the rankings of Part A are 
used to correlate with the teachers' ratings in variability. These 
were secured as follows. Each teacher rated every pupil he or 
she had had in class as to the character of the work done. Under
one of the three headings (consistent, variable, and erratic)
the teacher was asked to "check either the character of the
work in general or the character of the work in each subject."
Variable was to be considered as the step between consistent and
its opposite, erratic. From six to eleven ratings were thus secured
for each pupil. These were turned into per cents of consistent,
variable, and erratic ratings. The percentage of consistent
was weighted by three, the percentage of variable by
two, and the percentage of erratic by one. From the totals
of these weighted percentages the pupils were ranked from one to 
seventy-two. The largest percentage was ranked one, least vari- 
able, and the smallest seventy-two, most variable. 
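
The weighting just described can be sketched in Python as follows. The function names and the three hypothetical pupils are illustrative assumptions, and ties between totals are not handled.

```python
# Weights stated in the text: consistent 3, variable 2, erratic 1.
WEIGHTS = {"consistent": 3, "variable": 2, "erratic": 1}

def weighted_percentage_total(ratings):
    """Total of the weighted percentages for one pupil's teacher ratings."""
    n = len(ratings)
    return sum(
        WEIGHTS[label] * 100.0 * ratings.count(label) / n for label in WEIGHTS
    )

def rank_largest_first(totals):
    """Largest total is ranked one (least variable); no tie handling."""
    order = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    ranks = [0] * len(totals)
    for position, pupil in enumerate(order, start=1):
        ranks[pupil] = position
    return ranks

# Hypothetical ratings of three pupils by six teachers each (not study data).
pupil_ratings = [
    ["consistent"] * 4 + ["variable"] + ["erratic"],
    ["erratic"] * 6,
    ["consistent"] * 6,
]
totals = [weighted_percentage_total(r) for r in pupil_ratings]
teacher_ranks = rank_largest_first(totals)
```

Under this scheme a pupil rated consistent by every teacher totals 300 and one rated erratic by every teacher totals 100, so all totals fall between those limits.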

The coefficients in Part B resulted from correlating these rank- 
ings with the rankings made from the range in the eleven tests. 
Here again, although not high, the coefficients show consistently 
a small amount of positive relation. 

3. Correlation Between Measures of Ability and Variability

Having studied the resemblance between measures of ability 
and the resemblance between measures of variability the ques- 
tion naturally follows: What is the relation between ability 
and variability? The results from such correlations are shown 
in Table XXXVI. 

TABLE XXXVI

CORRELATION BETWEEN MEASURES OF ABILITY AND VARIABILITY IN THE
ELEVEN TESTS. (HIGHEST ABILITY AND LEAST VARIABILITY
RANKED ONE.)

                                                       Feb.   Feb.   June
                                                       1916   1917   1917
Ability by Median Rank in Values of Q
    with                                               .19    .31    .33
Variability by Range in Values of Q

Ability by Median Rank in Values of Q
    with                                               .26    .45    .43
Variability by Inter-Quartile Range in Q

Composite Ability by Average Rank in Three Testings
    with                                               .39
Composite Variability by Range in Three Testings

Composite Ability by Average Rank in Three Testings
    with                                               .CS
Composite Variability by Inter-Quartile Range
    (approximation) in Three Testings
The rankings used in the two preceding topics were used to
find the correlation between the ability and the variability of 
these pupils. Variability is correlated with ability for each of 
the three testings, first, by using the entire range in values of 
Q as the measure of variability, and second, by using the ap- 
proximation of the interquartile range as the measure of varia- 
bility. The median rank in values of Q is used as the measure 
of ability in both cases. The pupil having the highest median 
score is ranked one in ability and the pupil having the smallest 
range is ranked one in variability. The correlations result in 
positive coefficients in all cases, and interestingly, in greater 
amounts of relation when the coefficients secured in the later 
testings are compared with the coefficients of the first testing. 

From the relations shown by the coefficients of correlation in 
this section the following summary may be made. Higher co- 
efficients of correlation were obtained by ranking these pupils 
by their average achievement than by ranking them by their 
median achievement. When the number of tests given is rela- 
tively small the median is affected much more by slight devia- 
tions than is the average. The teachers' ratings in variability 
show positive coefficients when correlated with the variability 
as shown by the tests. The relation between ability and varia- 
bility as expressed by coefficients of correlation is not great but 
is consistently positive. It was greater in the later testings
than in the first testing. 



VIII 
CONCLUSIONS 

The questions asked in connection with the statement of the 
problem may be grouped under four headings. Although all of 
these questions have not been fully answered, the following conclusions
seem to be justified in view of the results obtained by
testing at three different times, during a period of a year and
a half, seventy-two junior high school boys with a group of 
eleven standardized scales and tests. 

A. Concerning methods of comparing or equating individual
measures of achievement. 

The method of comparing the scores of an individual by ranks 
from highest to lowest in a group is not satisfactory for the pur- 
pose of diagnosing individual achievements. By this method 
much of the refinement of the original measures is lost. The 
method of transmuting the original scores into multiples of a 
measure of variability of the group produces more reliable re- 
sults because practically all of the refinement of the original 
measures is preserved. The semi-interquartile-range or the av- 
erage deviation is to be preferred to the mean square deviation 
as a measure of variability for this kind of statistical treatment 
as the latter weights too heavily the extreme and erratic scores. 
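
As a rough illustration of the transmutation, the Python sketch below expresses each score as its distance from the group median in multiples of the group's Q. This reading of the method of Section IV is an assumption, as is the quartile approximation used by `statistics.quantiles`. The scores are hypothetical.

```python
from statistics import median, quantiles

def semi_interquartile_range(scores):
    """Q of the group: half the distance between the first and third quartiles."""
    q1, _, q3 = quantiles(scores, n=4)
    return (q3 - q1) / 2

def to_q_multiples(scores):
    """Express each score as a distance from the group median in multiples of Q."""
    centre = median(scores)
    q = semi_interquartile_range(scores)
    return [(s - centre) / q for s in scores]

# Hypothetical scores of seven pupils in one test (not data from the study).
group_scores = [10, 12, 14, 16, 18, 20, 22]
q_values = to_q_multiples(group_scores)
```

Unlike a ranking, the values keep the relative spacing of the original scores, which is the refinement the paragraph above refers to.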

B. Concerning the amount and distribution of individual 
variability. 

The variability of the individual in these tests is a large frac- 
tion of the variability of the group. The average amount of in- 
dividual variability, measured in terms of the Q, is 82 per cent 
of the group variability. This is evidence of the unreliability 
of one or a few tests for the purpose of educational prognosis. 

The tests used in the second and third testings are not in all 
cases repetitions of the same tests or tests comparable in the 
amount of absolute variability, but the results of certain tests 
which are comparable tend to show that the absolute amount of 
group variability is about the same in all testings. The indi- 
vidual variability in terms of the group variability is the same 
in the first and second testings, and slightly greater in the third. 

The form of distribution of the achievements of the individual
approximates the normal surface of frequency. The mode is
distinctly pronounced. The chief divergence from the normal
curve is skewness downward from the median.

The average range in the achievements of the individual is 
4.78Q in terms of the Q of the group. The average range in 
achievements above the individual medians is 1.94Q, and the 
average range below the individual medians is 2.84Q. 

The lowest ranking pupils are the most variable in their 
achievements. The variability of the second tertile is greater 
than the variability of the first tertile, and the variability of 
the third tertile is greater than that of the second tertile in each 
testing. Measured by the Q of the group the average of the 
three testings shows that the variability of the highest tertile is 
.69Q, that of the middle tertile, .78Q, and that of the lowest ter- 
tile, .99Q. 

The overlappings of the divisions of the group show marked 
amounts of difference between the median achievement of the 
different quartiles. In terms of the Q of the group the median 
achievement of the second quartile is .95Q lower than that of the 
first; that of the third, .65Q lower than that of the second; and 
the median achievement of the fourth is .84Q lower than that 
of the third. Therefore the pupils of this group who are most 
variable in their achievements are also distinctly lowest in 
achievements as measured by these tests. 

For the purpose of individual diagnosis it would be of ad- 
vantage to have more tests scaled from the zero point and stand- 
ardized in variability either by grade or by age of the pupil. 

C. Concerning extremely variable or erratic scores. 

Considering as erratic all scores at a distance of 3Q or more 
in each direction from the median score of the individual the 
average number of erratic scores for each testing is 4.5 per cent 
of the total number of scores for each testing. Twenty-nine per 
cent of the erratic scores are plus and 71 per cent are minus. 
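
The definition of an erratic score used here can be sketched as below. The function name and the sample scores are hypothetical; the scores are assumed to be already transmuted into multiples of the group Q.

```python
from statistics import median

def erratic_scores(q_scores, threshold=3.0):
    """Split one pupil's scores into those 3Q or more above or below his own median."""
    mid = median(q_scores)
    plus = [s for s in q_scores if s - mid >= threshold]
    minus = [s for s in q_scores if mid - s >= threshold]
    return plus, minus

# Hypothetical scores of one boy on eleven tests, in multiples of the group Q.
boy = [0.4, -0.2, 1.1, 0.0, -3.6, 0.7, 0.3, -0.5, 0.9, 0.2, 3.5]
plus, minus = erratic_scores(boy)
```

Counting the plus and minus lists over all boys in a testing, and dividing by the total number of scores, would yield percentages of the kind reported above.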

Spelling caused more erratic scores than any other test with 
the exception of Algebra, Addition and Subtraction, which was 
in process of construction and which was given only once. In 
spelling and the three rate tests,— opposites, mixed relations, and 
easy directions — all but one of the erratic scores are minus. 
These four tests contain 46 per cent of the erratic scores. In 
the remaining seven tests the total number of erratic scores plus 
is 30 and the total number minus is 29. 

There are 108 erratic scores in the three testings: 24 per 
cent in the first, 35 per cent in the second, and 41 per cent in 
the third testing. In the first testing 35 per cent of the boys 
made one or more erratic scores; in the second testing, 42 per 
cent; and in the third testing, 47 per cent. The distribution of 
erratic scores among the tertiles is as follows: 22 per cent are in 
the first tertile; 29 per cent are in the second; and 49 per cent 
are in the third tertile. Using the average number of boys who 
made erratic scores it is found that 24 per cent are in the first 
tertile; 32 per cent are in the second; and 44 per cent are in the 
third tertile. Therefore the results of this study show a notice- 
able increase in the number of pupils making erratic scores in 
the later testings and a slight increase in the number of erratic 
scores per pupil in the second and third testings. 

In this group of pupils 76 per cent made one or more erratic 
scores. Forty-two per cent made erratic scores in one testing 
only; 22 per cent in two testings; and 12 per cent in all three 
testings. Of the pupils who made erratic scores 55 per cent 
made them in the minus direction only; 16 per cent in the plus 
direction only; and 29 per cent in both the plus and minus di- 
rections. No distinct types of variation are found in this group 
of pupils. 

A re-examination under closely controlled conditions of a few 
boys who made the most variable scores in spelling and in the 
rate tests produced an average reduction of 25 per cent in the 
number of erratic scores on the basis of the total number of 
scores in the re-examination. 

Five possible factors in the causation of erratic scores were 
studied. They are: the nature of the tests used, the administra- 
tion of the tests, accidental or unusual occurrences, statistical 
treatment of the results, and the ability of the individual in dif- 
ferent traits. The nature of the tests and the statistical treat- 
ment of the results seem to have magnified the variability of a 
relatively small number of scores. The administration of the 
tests in so far as it can be judged was an unimportant factor. 
Accidental or unusual occurrences probably caused a small pro- 
portion of the erratic scores. From the evidence of this study 
it appears that the ability of the pupil in different traits was 
the greatest factor in the causation of scores that varied 3Q or 
more from the individual's median. 

D. Concerning the relation between measures of ability, be- 
tween measures of variability, and between measures of ability 
and variability. 

The coefficients of correlation between the different testings 
show that from the results of the first testing the average achieve- 
ment of these boys in similar tests a year later and a year and a 
half later could be predicted with a rather high degree of ac- 
curacy. However, the results of the tests and the judgments 
of the teachers agree in showing a very great amount of change 
in the ranking of certain individuals among the group in the 
later testings. For the purpose of individual diagnosis the re- 
sults obtained from a single testing with such a group of tests 
should be considered as indices of individual ability which will 
be valid for varying lengths of time. Such results should be 
supplemented and checked by repetitions of the same or similar 
tests. School organization should be flexible enough to allow 
for a shifting among groups for instruction commensurate with 
the relative gain or loss in ability on the part of certain indi- 
viduals. 

The correlation between the first and third testings which 
are a year and a half apart in point of time is about 10 per cent 
less than the correlation between either the first and second 
or the second and third testings. This supplements the evidence 
already found showing that the pupils vary more in their 
achievements the longer they remain in school. 

The amount of correlation between measures of variability in 
the different testings, although small, is positive in all cases. 

The coefficient of correlation between composite ability by 
average rank in the three testings and composite variability by 
interquartile range (approximation) in the three testings is .55. 
This seems to indicate that there was a considerable amount of 
relation between the ability of these pupils to achieve in these 
tests and the consistency or lack of variability in their achieve- 
ments. 
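
A coefficient of this kind can be sketched as follows. The text does not say which formula was used, so the product-moment coefficient below is an assumption, and the five pairs of values are hypothetical. Note that with rank 1 denoting the highest ability, a positive coefficient between rank number and variability means that the lower ranking pupils are the more variable.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of measures."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical pupils: average rank in three testings (1 = highest ability)
# and composite variability in multiples of the group Q.
average_rank = [1, 2, 3, 4, 5]
variability = [0.5, 0.7, 0.6, 0.9, 1.0]
r = pearson_r(average_rank, variability)
```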



TABLE XXXVII 

Distribution of Scores Above and Below the Individual Medians in 
the Eleven Tests Transmuted Into Multiples of Q by 
the Original Distributions 

The Measure of Central Tendency is the Median of the Individual's Scores 
Transmuted Into Multiples of Q. 

TABLE XXXVIII 

Distribution of Scores Above and Below the Individual Medians in 
Certain Tests Transmuted Into Multiples of Q by 
the Original Distributions 

The Measure of Central Tendency is the Median of the Individual's Scores 
Transmuted Into Multiples of Q. 

* Scores are lacking for four individuals in both Trabue D and E. 

TABLE XXXIX 

Original Scores by Tests and by Individuals. February, 1919 